Prosecution Insights
Last updated: April 18, 2026
Application No. 18/388,749

Systems and Methods for Bystander Pose Estimation for Industrial Vehicles

Status: Non-Final OA (§103)
Filed: Nov 10, 2023
Examiner: Reda, Matthew J
Art Unit: 3665
Tech Center: 3600 — Transportation & Electronic Commerce
Assignee: The Raymond Corporation
OA Round: 3 (Non-Final)

Grant Probability: 54% (Moderate)
Expected OA Rounds: 3-4
Median Time to Grant: 3y 2m
Grant Probability With Interview: 83%

Examiner Intelligence

Career Allow Rate: 54% (126 granted / 231 resolved; +2.5% vs TC avg)
Interview Lift: +28.5% (strong; allowance among resolved cases with an interview vs. without)
Avg Prosecution: 3y 2m (typical timeline); 46 applications currently pending
Total Applications: 277 (career history, across all art units)
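The headline figures above are simple ratios over the examiner's resolved cases; the projections section below notes that grant probability is derived from the career allow rate. Below is a minimal sketch of that arithmetic, assuming (consistent with the displayed 54% / +28.5% / 83% figures) that the with-interview number is the base rate plus the interview lift; the function names are ours, not the platform's.

```python
# Sketch of the examiner-metric arithmetic shown above. The platform's
# actual formulas are not public; the additive interview lift and the
# rounding to whole percents in the UI are assumptions consistent with
# the displayed 54% / +28.5% / 83% figures.

def allow_rate(granted: int, resolved: int) -> float:
    """Career allow rate: granted share of resolved cases."""
    return granted / resolved

def with_interview(base: float, lift: float) -> float:
    """Projected grant probability with an examiner interview, capped at 100%."""
    return min(base + lift, 1.0)

base = allow_rate(126, 231)             # 0.5455 -> displayed as 54%
boosted = with_interview(base, 0.285)   # 0.8305 -> displayed as 83%
print(f"base {base:.1%}, with interview {boosted:.1%}")
```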

Statute-Specific Performance

§101: 8.5% (-31.5% vs TC avg)
§103: 51.1% (+11.1% vs TC avg)
§102: 20.8% (-19.2% vs TC avg)
§112: 15.0% (-25.0% vs TC avg)

Tech Center averages are estimates. Based on career data from 231 resolved cases.
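A sanity check on the table: assuming each delta is a simple difference (examiner rate minus Tech Center average), all four rows back-solve to the same TC average of 40.0%, which suggests a single TC-wide baseline rather than per-statute averages. A short sketch of that check; the implied 40% baseline is derived here, not taken from the source.

```python
# Back-solve the Tech Center average implied by each statute row above.
# Assumes delta = examiner_rate - tc_average (a simple difference).
rows = {
    "§101": (0.085, -0.315),
    "§103": (0.511, +0.111),
    "§102": (0.208, -0.192),
    "§112": (0.150, -0.250),
}
for statute, (rate, delta) in rows.items():
    implied_tc_avg = rate - delta
    print(f"{statute}: implied TC avg {implied_tc_avg:.1%}")  # 40.0% for all four
```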

Office Action: §103 Non-Final (mailed Apr 01, 2026)
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA. Claims 1-20 are pending and examined below. This action is in response to the claims filed 2/11/26.

Continued Examination Under 37 CFR 1.114

A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 2/11/26 has been entered.

Response to Amendment

Applicant's arguments regarding the Claim Objections (see Applicant Remarks filed 2/11/26) are persuasive in view of the amendments filed 2/11/26; the Claim Objections are withdrawn. Applicant's arguments regarding the 35 USC § 102 and 35 USC § 103 rejections (see Applicant Remarks filed 2/11/26) are likewise persuasive in view of the amendments filed 2/11/26. However, upon further consideration, new grounds of rejection are made in view of Musk et al. (US 2020/0265247) below.

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103, which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-6, 8-13, and 15 are rejected under 35 U.S.C. 103 as being unpatentable over D'Ercoli et al. (US 2020/0189103) in view of Musk et al. (US 2020/0265247).
Regarding claims 1, 8, and 15, D'Ercoli discloses an automated guidance vehicle system for industrial environments including a bystander control system for a material handling vehicle (MHV) having a sensor and a vehicle control system (VCS), the bystander control system comprising (¶16-20 – AGV corresponding to the recited MHV, including sensors and a PLC to control operation of the vehicle corresponding to the recited VCS): an automation processing system (APS) having a processor and a memory, the APS coupled with the sensor and the VCS and having a machine learning control program stored in the memory, wherein the APS is configured to (¶16-20 and ¶92-94 – multi-channel virtual sensor 126 corresponding to the recited APS, including a processor and a memory with multiple machine learning models corresponding to the recited machine learning control program stored in the memory): receive sensor data based on the sensor output indicative of the environment proximate to the MHV (¶16-20 – receiving data from the cameras indicative of the environment proximate to the MHV); process the sensor data using the machine learning control program, wherein the machine learning control program comprises a first trained machine learning model configured to determine a bystander presence and a second trained machine learning model configured to determine a bystander pose (¶19 and ¶36-44 – feature extraction sub-module 316b/316c corresponding to the recited first trained machine learning model, which extracts shape information from the sensor data used to identify and classify objects, corresponding to the recited determining the presence of a bystander; and feature extraction sub-module 316d/316e corresponding to the recited second trained machine learning model, which estimates proximity between distinct objects in a scene, corresponding to the recited determining a bystander pose. Since the feature extraction module 130 may be configured to generate feature data or interaction predictors 131 from multiple machine learning models, feature extraction sub-modules 316b/316c and 316d/316e are interpreted as different machine learning models); generate, based on the processed sensor data, an output comprising an indication of a control action for the MHV; and send the generated output to the VCS (¶17 – in the event of detecting that a human, robot, or object enters an environment in which the robot 102, which may be stationary or mobile, is operating, the safety PLC 120 may output command data 122 to the robot 102 to slow or stop operation thereof so as to avoid a collision).

While D'Ercoli discloses numerous sub-modules, each consisting of different machine learning models, to detect, identify, classify, and then determine finer details such as pose, it does not explicitly disclose using a first ML model to determine a bystander presence based on sensor data and then using a second ML model to determine bystander pose based on an output from the first ML model. However, Musk discloses a system for estimating object properties utilizing visual data in an autonomous vehicle setting, including a first trained machine learning model configured to determine a bystander presence based on the sensor data and a second trained machine learning model configured to determine a bystander pose based on an output of the first trained machine learning model (¶30-33 and Fig. 2 – element 205 discloses utilizing a deep learning system with a trained machine learning model for identifying objects such as pedestrians, corresponding to the recited first trained machine learning model configured to determine a bystander presence based on the sensor data; element 207 then uses the related vision data and its associated identified object data from element 205 to determine ground truths for the identified object, including object parameters such as relative distance, direction, velocity, acceleration, etc., corresponding to the recited pose, through the use of a trained machine learning model corresponding to the recited second trained machine learning model configured to determine a bystander pose based on an output of the first trained machine learning model).

The combination of the numerous sub-modules of D'Ercoli, each consisting of different machine learning models to detect, identify, classify, and then determine finer details such as pose, with the explicit multi-phase, multi-model pedestrian identification and ground truth estimation of Musk fully discloses the elements as claimed. It would have been obvious to one of ordinary skill in the art before the effective filing date to have combined them in order to improve the accuracy and efficiency of identified objects to maintain vehicle safety (Musk – ¶65).

Regarding claims 2, 3, 4, 9, 10, and 11, D'Ercoli further discloses wherein the first trained machine learning model comprises a bystander detection machine learning model stored in the memory of the APS and the second trained machine learning model comprises a bystander pose estimation machine learning model stored in the memory of the APS (¶19, ¶36-44, and Fig. 3 – feature extraction sub-module 316b/316c corresponding to the recited first trained machine learning model and feature extraction sub-module 316d/316e corresponding to the recited second trained machine learning model, where the feature extraction sub-modules are part of the multi-channel virtual sensor 126 corresponding to the recited APS including a processor and a memory; therefore the models are stored in the APS), wherein: processing the sensor data comprises: determining, using the first trained machine learning model, whether a bystander is in proximity of the MHV; and determining, using the second trained machine learning model, the pose of the bystander (¶19 and ¶32-44 – feature extraction sub-module 316b/316c corresponding to the recited first trained machine learning model determines blobs/shapes of the objects, corresponding to the recited determining whether a bystander is in proximity of the MHV, and feature extraction sub-module 316d/316e corresponding to the recited second trained machine learning model estimates proximity between distinct objects in a scene and specific features typically associated with human body parts, such as faces, hands, and legs, corresponding to the recited pose of the bystander); and the APS is configured to generate the output based on the determined bystander pose (¶17 – in the event of detecting that a human, robot, or object enters an environment in which the robot 102, which may be stationary or mobile, is operating, the safety PLC 120 may output command data 122 to the robot 102 to slow or stop operation thereof so as to avoid a collision).

Regarding claims 5 and 12, D'Ercoli further discloses wherein the APS is configured to send an indication of the determined bystander pose to a second MHV (¶19-20 – when risky robot-to-human interactions are determined, the AGV may communicate control data, informational data, classification data, etc., corresponding to the recited determined bystander pose, to any or all of the control PLC[s], corresponding to the recited second MHV, given that each MHV is controlled by a PLC).

Regarding claims 6 and 13, D'Ercoli further discloses wherein the sensor comprises a first sensor of a first type and a second sensor of a second type, and wherein generating sensor data indicative of the environment proximate to the MHV comprises generating a first sensor data based on the output of the first sensor and a second sensor data based on the output of the second sensor (¶25 – cameras may include 2D cameras, 3D cameras, or any other type of optical or spectral camera, corresponding to the recited first sensor type and second sensor type, which respectively generate different sensor data of the environment).

Claims 7, 14, and 16-20 are rejected under 35 U.S.C. 103 as being unpatentable over D'Ercoli et al. (US 2020/0189103) in view of Musk et al. (US 2020/0265247), as applied to claims 1 and 8 above, and further in view of Hashimoto et al. (US 2021/0272434).

Regarding claims 7 and 14, D'Ercoli does not explicitly disclose comparing sensor data to a predetermined map; however, Hashimoto discloses a smart navigation system including wherein generating the sensor data indicative of the environment proximate to the MHV comprises comparing an output of the sensor to a predetermined map of the environment (¶66 – the detection unit 2 may be configured to detect the presence of the moving object 9 based on a comparison between a detection result Da of the ranging sensor 74 and the map information M of the target area).
The combination of the bystander presence and pose detection system of D'Ercoli in view of Musk with the map-discrepancy-based moving object analysis of Hashimoto fully discloses the elements as claimed. It would have been obvious to one of ordinary skill in the art before the effective filing date to have combined them in order to monitor large areas for suspicious persons without the labor costs associated with human patrols and the blind spots generated by cameras in a known environment (Hashimoto – ¶3).

Regarding claims 16 and 19, while D'Ercoli does disclose detecting and identifying the presence and pose of a bystander, it does not explicitly utilize an identified discrepancy between sensor and map data. However, Hashimoto further discloses wherein processing the sensor data comprises: receiving, by the first trained machine learning model, an identified discrepancy between the sensor data and a predetermined map of an environment of the MHV (¶118 and Fig. 7 – S2, corresponding to the recited identifying a discrepancy between the sensor data and a predetermined map of an environment of the MHV); and determining, by the first trained machine learning model, whether the identified discrepancy is a bystander (¶118-119 and Fig. 7 – S3, corresponding to the recited determining whether the identified discrepancy is a bystander). The same combination and motivation stated above apply (Hashimoto – ¶3).
Regarding claims 17 and 20, D'Ercoli further discloses wherein processing the sensor data comprises: in response to a determination by the first trained machine learning model that the [object] is a bystander, outputting an indication of the bystander to the second trained machine learning model (¶19 and ¶32-44 – feature extraction sub-module 316b/316c corresponding to the recited first trained machine learning model determines blobs/shapes of the objects, and feature extraction sub-module 316d/316e corresponding to the recited second trained machine learning model estimates proximity between distinct objects in a scene and specific features typically associated with human body parts, such as faces, hands, and legs, corresponding to the recited pose of the bystander, where the results of sub-module 316b/316c are used in sub-module 316d/316e); and in response to a determination by the trained machine learning model that the [object] does not correspond to a bystander, identifying the discrepancy as a potential obstacle and registering at least one of a position or a size of the obstacle (¶19, ¶36-44, ¶84-87, and Fig. 3 – each time a decision is to be made, a historical database maintained inside an ensemble classifier may be updated by modifying a row or adding a new record, where the classifier modules compile the results of the determinations, including whether the object is a human or not and its extracted features such as position and size, and the resulting features are stored for machine learning purposes). While D'Ercoli does disclose detecting and identifying the presence and pose of a bystander, it does not explicitly utilize an identified discrepancy. However, Hashimoto further discloses, in response to S3 identifying the discrepancy as a bystander, proceeding to S4 to confirm the position and behavior of the person, corresponding to the recited outputting to the second trained machine learning model to determine pose information (¶118-119 and Fig. 7). The same combination and motivation stated above apply (Hashimoto – ¶3).

Regarding claim 18, D'Ercoli further discloses wherein the APS is configured to: in response to a determination by the first trained machine learning model that the discrepancy corresponds to the bystander, isolate a portion of the sensor data that includes the discrepancy (¶19, ¶36-46, ¶84-87, and Fig. 3 – each time a decision is to be made, a historical database maintained inside an ensemble classifier may be updated by modifying a row or adding a new record, where the extracted features are saved as training data, corresponding to the recited isolating a portion of the sensor data that includes the discrepancy). While D'Ercoli does disclose detecting and identifying the presence and pose of a bystander, it does not explicitly utilize an identified discrepancy. However, Hashimoto further discloses, in response to S3 identifying the discrepancy as a bystander, proceeding to S4 to confirm the position and behavior of the person, corresponding to the recited outputting to the second trained machine learning model to determine pose information (¶118-119 and Fig. 7). The same combination and motivation stated above apply (Hashimoto – ¶3).

Additional References Cited

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Pilarski et al. (US 2017/0329332) discloses an autonomous vehicle control system which utilizes machine learning models for determining and identifying objects in the environment and then further determining the objects' pose and predicted movements (¶51-56). Schulte et al., "Autonomous Human-Vehicle Leader-Follower Control Using Deep-Learning-Driven Gesture Recognition," Vehicles, March 9, 2022, 4(1):243-258, discloses a step-based object detection, pose estimation, and classification system utilizing successive machine learning models (Fig. 5 and pg. 248).

Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to Matthew J Reda, whose telephone number is (408) 918-7573. The examiner can normally be reached Monday - Friday, 7-4 ET. Examiner interviews are available via telephone, in person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Hunter Lonsberry, can be reached at (571) 272-7298. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300. Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).
If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/MATTHEW J. REDA/
Primary Examiner, Art Unit 3665
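The rejection above centers on a two-stage architecture: a first trained model determines bystander presence from sensor data, a second trained model determines bystander pose from the first model's output, and the result maps to a control action sent to the vehicle control system (VCS), with dependent claims adding map-discrepancy and obstacle-logging behavior. Below is a minimal illustrative sketch of that data flow; it is not drawn from the application's claims or the cited references, and every name, type, and threshold in it is hypothetical.

```python
# Illustrative two-stage pipeline matching the claim structure discussed
# above: stage 1 detects a bystander; stage 2 estimates pose from stage 1's
# output; the pose maps to a VCS command. All names, fields, and thresholds
# are hypothetical, not taken from the application or the cited art.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Detection:
    is_bystander: bool
    region: tuple[int, int, int, int]  # portion of the frame with the hit

@dataclass
class Pose:
    facing_vehicle: bool
    distance_m: float

Detector = Callable[[object], Detection]         # first trained model
PoseModel = Callable[[object, Detection], Pose]  # second trained model

def control_action(pose: Pose) -> str:
    """Map an estimated pose to a VCS command (thresholds invented)."""
    if pose.distance_m < 2.0:
        return "stop"
    if not pose.facing_vehicle:
        return "slow"  # bystander likely unaware of the vehicle
    return "proceed"

def process_frame(frame: object, detect: Detector, estimate: PoseModel) -> str:
    det = detect(frame)          # stage 1: bystander presence
    if not det.is_bystander:
        return "proceed"         # the claims would log a potential obstacle here
    pose = estimate(frame, det)  # stage 2: pose from stage 1's output
    return control_action(pose)

# Toy stand-ins so the pipeline runs end to end:
if __name__ == "__main__":
    dummy_detect = lambda f: Detection(True, (0, 0, 64, 128))
    dummy_pose = lambda f, d: Pose(facing_vehicle=False, distance_m=4.5)
    print(process_frame(None, dummy_detect, dummy_pose))  # -> slow
```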

Prosecution Timeline

Nov 10, 2023
Application Filed
Aug 06, 2025
Non-Final Rejection — §103
Oct 31, 2025
Response Filed
Dec 05, 2025
Final Rejection — §103
Feb 11, 2026
Request for Continued Examination
Mar 03, 2026
Response after Non-Final Action
Apr 01, 2026
Non-Final Rejection — §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12573248
AN ELECTRONIC CONTROL UNIT FOR A VEHICLE CAPABLE OF CONTROLLING MULTIPLE ELECTRICAL LOADS
Granted Mar 10, 2026 (2y 5m to grant)
Patent 12570509
INDUSTRIAL TRUCK WITH DETECTION DEVICES ON THE FORKS
Granted Mar 10, 2026 (2y 5m to grant)
Patent 12533065
METHOD AND APPARATUS FOR CLASSIFYING SUBJECT INDEPENDENT DRIVER STATE USING BIO-SIGNAL
Granted Jan 27, 2026 (2y 5m to grant)
Patent 12530029
SYSTEM AND METHOD OF ADAPTIVE, REAL-TIME VEHICLE SYSTEM IDENTIFICATION FOR AUTONOMOUS DRIVING
Granted Jan 20, 2026 (2y 5m to grant)
Patent 12525071
METHOD FOR ASSISTED OPERATING SUPPORT OF A GROUND COMPACTION MACHINE AND GROUND COMPACTION MACHINE
Granted Jan 13, 2026 (2y 5m to grant)
Study what changed to get these cases past this examiner. Based on the 5 most recent grants.

Prosecution Projections

Expected OA Rounds: 3-4
Grant Probability: 54%
With Interview: 83% (+28.5%)
Median Time to Grant: 3y 2m
PTA Risk: High
Based on 231 resolved cases by this examiner. Grant probability derived from career allow rate.
