DETAILED ACTION
This Office Action is in response to the application filed on 04/29/2024, wherein claims 1-20 have been examined and are pending.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Information Disclosure Statement
The information disclosure statement (IDS) was submitted on 08/19/2024. The submission is in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statement is being considered by the examiner.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office action.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
1. Claims 1, 4-6, 10, 12-13, 16 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Liu et al. (CN 102800103 – see translation attached) hereinafter Liu, in view of Fang et al. (U.S. 2023/0237801) hereinafter Fang, in view of Doolittle et al. (US 2014/0094307) hereinafter Doolittle.
Regarding claims 1 and 12-13, Liu discloses a posture recognition system, a posture recognition method, and an electronic device, comprising a processor and a memory, wherein the memory is configured to store a program executable by the processor, and the processor is configured to read the program in the memory and perform the following:
N depth cameras, N processing modules and a control module, wherein one of the processing modules is configured to process a target image captured by one of the depth cameras, and N is an integer greater than 1; the N depth cameras capture target images from different locations, and transmit the target images captured respectively to corresponding processing modules, wherein the target images comprises depth information of a target object; the N processing modules respectively determine a depth map of the target object based on the depth information comprised in the target images received respectively, perform a posture recognition on the target object in the target images in view of the depth map of the target object, to obtain N recognition results corresponding to the N depth cameras respectively, and send the N recognition results to the control module; and the control module combines the N recognition results, and determines a combination result as a recognition result of the target image (Liu [0009], [0011], [0018], [0039]-[0043]: multi-view depth camera with different perspective obtains calibration parameters and depth maps and three-dimensional spatial transformation to obtain a point cloud set based on the calibration parameters and depth maps. Multiple depth maps are transformed and fused to obtain a unified point cloud set, and match each 3D point P in the point cloud set with each surface mesh point V on the human body model to obtain the matching result; [0044]-[0047]: based on the matching result, motion capture is performed according to the human skeleton driven surface model to obtain the tracking results).
Liu does not explicitly disclose a processor, a memory configured to store a program executable by the processor, and N processing modules, wherein the N processing modules respectively obtain N recognition results corresponding to the N depth cameras respectively.
However, Fang discloses a processor, a memory configured to store a program executable by the processor (Fang [0029], [0092]: processor and memory; [0029], [0094], [0106]: machine-readable medium storing a computer program), and N processing modules, wherein the N processing modules respectively obtain N recognition results corresponding to the N depth cameras respectively (Fang Figs. 1-2, [0034]: multi-camera environment includes camera array 101 having any number, N, of cameras 102 and 103, each camera generating corresponding video including video 140 and 150; [0094]: image processor 1802 and central processor 1101 may include any number and type of image or graphics processors, processing units or modules that provide control; [0034]: player detection modules 106 and 107 to process video from camera 1 and video from camera N, respectively, as in Fig. 1; [0043]: human pose estimation and body points such as ankle points can be determined).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to use the method and system, as disclosed by Liu, and further incorporate a processor, a memory configured to store a program executable by the processor, a machine-readable medium storing a computer program, and N processing modules, wherein the N processing modules respectively obtain N recognition results corresponding to the N depth cameras respectively, as taught by Fang, to detect objects with high accuracy and stability (Fang [0005]).
Liu broadly discloses performing a posture recognition on the target object in the target images in view of the depth map of the target object, to obtain N recognition results corresponding to the N depth cameras respectively, and sending the N recognition results to the control module, wherein the control module combines the N recognition results and determines a combination result as a recognition result of the target image, as discussed above.
Doolittle further discloses N depth cameras capture target images from different locations, and transmit the target images captured respectively to corresponding processing modules, wherein the target images comprises depth information of a target object; the N processing modules respectively determine a depth map of the target object based on the depth information comprised in the target images received respectively, perform a posture recognition on the target object in the target images in view of the depth map of the target object, to obtain N recognition results corresponding to the N depth cameras respectively, and send the N recognition results to the control module; and the control module combines the N recognition results, and determines a combination result as a recognition result of the target image (Doolittle Figs. 1-7, [0017]-[0020], [0030]: different depth cameras 14A-14C disposed at different positions to detect depth map of target object; [0026]-[0030]: First depth camera detects first portion of depth data of the object and second depth camera detects second portion of depth data of the object; Fig. 7, [0036]-[0037]: detecting first portion of gesture of object based on first portion of depth data from first depth camera, detecting second portion of gesture of object based on second portion of depth data from second depth camera, and combine the first portion and second portion of detected gesture to yield the combined data in the form of a completed gesture).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to use the method and system, as disclosed by Liu and Fang, and further incorporate performing a posture recognition on the target object in the target images in view of the depth map of the target object, to obtain N recognition results corresponding to the N depth cameras respectively, and sending the N recognition results to the control module, wherein the control module combines the N recognition results and determines a combination result as a recognition result of the target image, as taught by Doolittle, to detect the object's gesture with greater accuracy and sufficiency (Doolittle [0018], [0030]).
Regarding claims 4 and 16, Liu, Fang, and Doolittle disclose all limitations of claims 1 and 13, respectively.
Liu does not explicitly disclose wherein the processing modules and the processor are configured to read the program in the memory and perform the following: performing depth map computing and posture recognition computing respectively using different cores.
Fang discloses wherein the processing modules and the processor are configured to read the program in the memory and perform the following: performing depth map computing and posture recognition computing respectively using different cores (Fang [0094]: execution unit EU of image processor 1802 may include a logic core or cores that may provide a wide array of programmable logic functions; [0106], [0116]: the processor, including processor cores, may undertake one or more of the blocks of the example processes in response to program code and instructions; [0111], [0114]: multi-core processor).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to use the method and system, as disclosed by Liu, Fang, and Doolittle, and further incorporate the processing modules and the processor being configured to read the program in the memory and perform the following: performing depth map computing and posture recognition computing respectively using different cores, as taught by Fang, to detect objects with high accuracy and stability (Fang [0005]).
Regarding claim 5, Liu, Fang, and Doolittle disclose all limitations of claim 1.
Liu does not explicitly disclose wherein the control module comprises a first sub-module and a second sub-module; the first sub-module is configured to receive the N recognition results, and send the N recognition results to the second sub-module; and the second sub-module is configured to combine the N recognition results to obtain a combination result.
However, Liu discloses wherein the control module receives the N recognition results and combines the N recognition results to obtain a combination result, as discussed in claim 1 above (Liu [0009], [0011], [0018], [0039]-[0043]: multi-view depth camera with different perspectives obtains calibration parameters and depth maps and three-dimensional spatial transformation to obtain a point cloud set based on the calibration parameters and depth maps. Multiple depth maps are transformed and fused to obtain a unified point cloud set, and each 3D point P in the point cloud set is matched with each surface mesh point V on the human body model to obtain the matching result; [0044]-[0047]: based on the matching result, motion capture is performed according to the human skeleton driven surface model to obtain the tracking results).
Furthermore, Fang discloses the control module comprises a first sub-module and a second sub-module which can be used to perform any desired functions of the system (Fang [0094]: execution unit EU of image processor 1802 may include a logic core or cores that may provide a wide array of programmable logic functions; [0106], [0116]: the processor including processor cores may undertake one or more of the blocks of the example processes in response to program code and instructions; [0111], [0114]: multi-core processor. Hence, the first sub-module and the second sub-module can be used to perform different functions, including receiving and sending the N recognition results and combining the recognition results to obtain a combination result, using the first sub-module or the second sub-module).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to use the method and system, as disclosed by Liu, Fang, and Doolittle, and further incorporate the control module comprising a first sub-module and a second sub-module, wherein the first sub-module is configured to receive the N recognition results and send the N recognition results to the second sub-module, and the second sub-module is configured to combine the N recognition results to obtain a combination result, as taught by Fang, to detect objects with high accuracy and stability (Fang [0005]).
Furthermore, it would have been an obvious matter of design choice to have the first sub-module and the second sub-module perform any desired functions, including the first sub-module being configured to receive the N recognition results and send the N recognition results to the second sub-module, and the second sub-module being configured to combine the N recognition results to obtain a combination result, as recited in claim 5, since applicant has not disclosed that the claimed features serve any particular purpose.
Regarding claim 6, Liu, Fang, and Doolittle disclose all limitations of claim 5.
Liu does not explicitly disclose wherein the first sub-module is further configured to send any one or more of the following information to the processing modules: a firmware code used to initialize the processing modules; a notification message used to notify the processing modules to initialize a corresponding depth camera; and a model parameter used by the processing modules to perform the posture recognition on a target image received.
Doolittle discloses the first sub-module is further configured to send any one or more of the following information to the processing modules: a firmware code used to initialize the processing modules; a notification message used to notify the processing modules to initialize a corresponding depth camera; and a model parameter used by the processing modules to perform the posture recognition on a target image received (Doolittle [0030]: the system can be initialized by arranging a recognizable and sufficiently asymmetric reference simultaneously in sight of all the depth cameras in use, and the field of view orientations of the various depth cameras may then be adjusted parametrically; [0033]-[0035]: a previously trained collection of body models may be used to label particular body parts, and each joint of the body part can be assigned various parameters specifying a conformation of the corresponding body part, hence a model parameter to perform posture recognition).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to use the method and system, as disclosed by Liu and Fang, and further incorporate the first sub-module being further configured to send any one or more of the following information to the processing modules: a firmware code used to initialize the processing modules; a notification message used to notify the processing modules to initialize a corresponding depth camera; and a model parameter used by the processing modules to perform the posture recognition on a target image received, as taught by Doolittle, to achieve consistency in the topology of the reference as perceived by each of the depth cameras and to detect the object's gesture with greater accuracy and sufficiency (Doolittle [0018], [0030]).
Regarding claim 10, Liu, Fang, and Doolittle disclose all limitations of claim 5.
Liu does not explicitly disclose wherein the second sub-module is further configured to perform any one or more of the following: displaying N recognition results; displaying and storing the combination result; and receiving a file comprising model parameters input by a user, and sending the model parameters in the file to a corresponding processing module through the first sub-module.
However, Fang discloses the second sub-module is further configured to perform any one or more of the following: displaying N recognition results; displaying and storing the combination result; and receiving a file comprising model parameters input by a user, and sending the model parameters in the file to a corresponding processing module through the first sub-module (Fang [0092]: memory to store video frames and any data; [0115], [0118], [0124]: display to display video or data of the object and movement of navigation features of the object).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to use the method and system, as disclosed by Liu, Fang, and Doolittle, and further incorporate the second sub-module being further configured to perform any one or more of the following: displaying N recognition results; displaying and storing the combination result; and receiving a file comprising model parameters input by a user, and sending the model parameters in the file to a corresponding processing module through the first sub-module, as taught by Fang, to detect objects with high accuracy and stability (Fang [0005]).
Regarding claim 20, Liu, Fang, and Doolittle disclose a non-transitory computer storage medium storing a computer program which, when executed by a processor, causes the processor to implement the steps of the method according to claim 12, as discussed for claim 12 above.
2. Claims 2 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Liu et al. (CN 102800103 – see translation attached) hereinafter Liu, in view of Fang et al. (U.S. 2023/0237801) hereinafter Fang, in view of Doolittle et al. (US 2014/0094307) hereinafter Doolittle, further in view of Koppal et al. (US 2015/0062558) hereinafter Koppal.
Regarding claims 2 and 14, Liu, Fang, and Doolittle disclose all limitations of claims 1 and 13, respectively.
Liu does not explicitly disclose wherein the processing modules and the processor are configured to read the program in the memory and perform the following: determine the depth map of the target object based on depth information comprised in consecutive M frames of target images; wherein depth information comprised in any two frames of target images in the M frames of target images correspond to different phase information; the M is determined based on shooting parameters of the depth cameras, and the M is an integer greater than 0.
However, Koppal discloses wherein the processing modules and the processor are configured to read the program in the memory and perform the following: determine the depth map of the target object based on depth information comprised in consecutive M frames of target images; wherein depth information comprised in any two frames of target images in the M frames of target images correspond to different phase information; the M is determined based on shooting parameters of the depth cameras, and the M is an integer greater than 0 (Koppal [0007]-[0009], [0047]: capturing a plurality of high frequency phase-shifted structured light images of a scene, generating a time-of-flight (TOF) depth image of the scene using the TOF sensor, and computing a depth map from the plurality of high frequency phase-shifted structured light images; [0041]: the number of phase-shifted images that may be captured depends on the speed of the camera and the TOF sensor, hence M frames of images is based on shooting parameters of the depth camera).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to use the method and system, as disclosed by Liu, Fang, and Doolittle, and further incorporate the processing modules being configured to: determine the depth map of the target object based on depth information comprised in consecutive M frames of target images; wherein depth information comprised in any two frames of target images in the M frames of target images correspond to different phase information; the M is determined based on shooting parameters of the depth cameras, and the M is an integer greater than 0, as taught by Koppal, for more accurate depth maps of targets (Koppal [0006]).
3. Claims 3, 11 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Liu et al. (CN 102800103 – see translation attached) hereinafter Liu, in view of Fang et al. (U.S. 2023/0237801) hereinafter Fang, in view of Doolittle et al. (US 2014/0094307) hereinafter Doolittle, further in view of Mani et al. (US 2020/0117336) hereinafter Mani.
Regarding claims 3 and 15, Liu, Fang, and Doolittle disclose all limitations of claims 1 and 13, respectively.
Liu discloses wherein the processing modules are configured to: input the depth map of the target object and the target images into a posture recognition model, and output a recognition result of the target object (Liu [0009], [0011], [0018], [0039]-[0043]: multi-view depth camera with different perspective obtains calibration parameters and depth maps and three-dimensional spatial transformation to obtain a point cloud set based on the calibration parameters and depth maps. Multiple depth maps are transformed and fused to obtain a unified point cloud set, and match each 3D point P in the point cloud set with each surface mesh point V on the human body model to obtain the matching result; [0044]-[0047]: based on the matching result, motion capture is performed according to the human skeleton driven surface model to obtain the tracking results).
Liu does not explicitly disclose wherein posture recognition models for posture recognition of different processing modules comprise different model parameters.
Mani discloses wherein posture recognition models for posture recognition of different processing modules comprise different model parameters (Mani [0038]: hand gesture analysis module for analyzing hand gesture data to recognize various hand gestures based on gesture data and depth data, and different hand gesture models can be used; [0044]: different hand gesture recognition modules 232; [0090]: different machine learning models to analyze and determine the user's various hand gestures can be used, which would comprise different model parameters).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to use the method and system, as disclosed by Liu, Fang, and Doolittle, and further incorporate the processing modules being configured to: input the depth map of the target object and the target images into a posture recognition model, and output a recognition result of the target object; wherein posture recognition models for posture recognition of different processing modules comprise different model parameters, as taught by Mani, for improved detection accuracy and efficiency (Mani [0075]).
Regarding claim 11, Liu, Fang, and Doolittle disclose all limitations of claim 1.
Liu does not explicitly disclose wherein the target object is a hand, the target image is a gesture image, and the recognition result is a gesture recognition result; the control module is further configured to: obtain the gesture recognition result, and perform a corresponding gesture interaction operation using the gesture recognition result.
However, Mani discloses wherein the target object is a hand, the target image is a gesture image, and the recognition result is a gesture recognition result; the control module is further configured to: obtain the gesture recognition result, and perform a corresponding gesture interaction operation using the gesture recognition result (Mani Figs. 6A-6E, [0103]-[0106]: capturing, by cameras, hand gestures 606 or 618, recognizing that the hand gesture is unfastening screws of a physical object such as a fridge, translating the first hand gesture into a first operation, and displaying an image 604 having fridge 612, hand gesture 614 and information 616, hence performing a corresponding gesture interaction operation using the gesture recognition result; [0108], [0116], [0119]: the hand gesture can be a second hand gesture to interact with a particular electronic component, or a third hand gesture of swiping left to flip a page or swiping up to zoom in on a selected part).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to use the method and system, as disclosed by Liu, Fang, and Doolittle, and further incorporate the target object being a hand, the target image being a gesture image, and the recognition result being a gesture recognition result, the control module being further configured to: obtain the gesture recognition result, and perform a corresponding gesture interaction operation using the gesture recognition result, as taught by Mani, for improved detection accuracy and efficiency (Mani [0075]).
4. Claims 7 and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Liu et al. (CN 102800103 – see translation attached) hereinafter Liu, in view of Fang et al. (U.S. 2023/0237801) hereinafter Fang, in view of Doolittle et al. (US 2014/0094307) hereinafter Doolittle, further in view of Aldridge et al. (US 9,819,903) hereinafter Aldridge.
Regarding claims 7 and 17, Liu, Fang, and Doolittle disclose all limitations of claims 5 and 13, respectively.
Liu does not explicitly disclose wherein the first sub-module and the processor are configured to read the program in the memory and perform the following: sending a pulse width modulation signal to a corresponding depth camera using the processing modules, and controlling different depth cameras to perform exposure shooting at intervals using pulse width modulation signals of different depth cameras; or, adjusting a register value of a corresponding depth camera using the processing modules, and controlling different depth cameras to perform exposure shooting at intervals using register values of different depth cameras.
Aldridge discloses wherein the first sub-module and the processor are configured to read the program in the memory and perform the following: sending a pulse width modulation signal to a corresponding depth camera using the processing modules, and controlling different depth cameras to perform exposure shooting at intervals using pulse width modulation signals of different depth cameras; or, adjusting a register value of a corresponding depth camera using the processing modules, and controlling different depth cameras to perform exposure shooting at intervals using register values of different depth cameras (Aldridge Col. 3, lines 5-13, Claim 4: the controller generates pulse width modulated signals to modulate the camera exposure time and perform exposure shooting at intervals, such as every 33.33 milliseconds).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to use the method and system, as disclosed by Liu, Fang, and Doolittle, and further incorporate the first sub-module and the processor being configured to read the program in the memory and perform the following: sending a pulse width modulation signal to a corresponding depth camera using the processing modules, and controlling different depth cameras to perform exposure shooting at intervals using pulse width modulation signals of different depth cameras; or, adjusting a register value of a corresponding depth camera using the processing modules, and controlling different depth cameras to perform exposure shooting at intervals using register values of different depth cameras, as taught by Aldridge, to provide controllable camera exposure and achieve desired exposure times and frame rates (Aldridge Col. 3, lines 5-13).
5. Claims 8 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Liu et al. (CN 102800103 – see translation attached) hereinafter Liu, in view of Fang et al. (U.S. 2023/0237801) hereinafter Fang, in view of Doolittle et al. (US 2014/0094307) hereinafter Doolittle, further in view of Aldridge et al. (US 9,819,903) hereinafter Aldridge, further in view of Kurosawa et al. (US 6,654,060) hereinafter Kurosawa.
Regarding claims 8 and 18, Liu, Fang, Doolittle, and Aldridge disclose all limitations of claims 7 and 17, respectively.
Liu does not explicitly disclose that the first sub-module and the processor are configured to read the program in the memory and perform the following: determine a shooting interval duration of each of N depth cameras, and send the shooting interval duration to a corresponding processing module, so that the corresponding processing module determines a register value of a corresponding depth camera based on the shooting interval duration received, and sends the register value to the corresponding depth camera.
Kurosawa discloses the first sub-module and the processor are configured to read the program in the memory and perform the following: determine a shooting interval duration of each of N depth cameras, and send the shooting interval duration to a corresponding processing module, so that the corresponding processing module determines a register value of a corresponding depth camera based on the shooting interval duration received, and sends the register value to the corresponding depth camera (Kurosawa Col. 5, lines 1-5 and 31-40: image-sensing time and image-sensing condition are registered into a reservation register, and the register is stored in the format of a table in a memory device of a controller; Col. 8, lines 59-67, Col. 9, lines 1-40, Fig. 6: the registered information includes image end time and image interval time, such as 15 minutes as in Table 6, hence determining a register value of a camera based on shooting interval duration; Col. 17, lines 44-46: performing camera control and storing the result of image-sensing based on data registered in the register, hence sending the register value to the corresponding camera).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to use the method and system, as disclosed by Liu, Fang, Doolittle, and Aldridge, and further incorporate the first sub-module and the processor being configured to read the program in the memory and perform the following: determine a shooting interval duration of each of N depth cameras, and send the shooting interval duration to a corresponding processing module, so that the corresponding processing module determines a register value of a corresponding depth camera based on the shooting interval duration received, and sends the register value to the corresponding depth camera, as taught by Kurosawa, to improve efficiency and control of the camera (Kurosawa Col. 27, lines 59-65, Col. 1, lines 30-55).
6. Claims 9 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Liu et al. (CN 102800103 – see translation attached) hereinafter Liu, in view of Fang et al. (U.S. 2023/0237801) hereinafter Fang, in view of Doolittle et al. (US 2014/0094307) hereinafter Doolittle, further in view of Chandler et al. (US 2019/0251702) hereinafter Chandler.
Regarding claims 9 and 19, Liu, Fang, and Doolittle disclose all limitations of claims 5 and 13, respectively.
Liu discloses combining the N recognition results to obtain the combination result, as discussed in claims 1 and 13 above.
Liu does not explicitly disclose wherein the second sub-module is further configured to: receive N recognition results using a separate sub-thread, and combine the N recognition results to obtain the combination result.
However, Chandler discloses wherein the second sub-module is further configured to: receive N recognition results using a separate sub-thread, and perform further gesture recognition (Chandler Figs. 42 and 45, [0319]: receiving, by a first thread of a processing unit, a set of data including gesture data obtained by the capture device, and a second thread performs a gesture recognition operation based on the set of data, hence the second sub-thread receives data and performs gesture recognition; Fig. 55, [0400]-[0404]: the first thread can perform pose estimation using a first set of images, and the second thread can perform pose estimation using a second set of images).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to use the method and system, as disclosed by Liu, Fang, and Doolittle, and further incorporate the second sub-module being further configured to: receive N recognition results using a separate sub-thread, and perform further gesture recognition, which can be combining the N recognition results to obtain the combination result as in Liu, as taught by Chandler, to improve computation efficiency (Chandler [0060]).
7. Claims 1, 4-5, 12-13, 16 and 20 are rejected, in the alternative, under 35 U.S.C. 103 as being unpatentable over Doolittle et al. (US 2014/0094307) hereinafter Doolittle, in view of Fang et al. (U.S. 2023/0237801) hereinafter Fang.
Regarding claims 1, 12-13 and 20, Doolittle discloses a posture recognition system, a posture recognition method, and an electronic device, comprising a processor and a memory, wherein the memory is configured to store a program executable by the processor, and the processor is configured to read the program in the memory ([0043]: processor; [0041], [0044]-[0045]: memory and storage to store instructions) and perform the following:
N depth cameras, N processing modules and a control module, wherein one of the processing modules is configured to process a target image captured by one of the depth cameras, and N is an integer greater than 1; the N depth cameras capture target images from different locations, and transmit the target images captured respectively to corresponding processing modules, wherein the target images comprises depth information of a target object; the N processing modules respectively determine a depth map of the target object based on the depth information comprised in the target images received respectively, perform a posture recognition on the target object in the target images in view of the depth map of the target object, to obtain N recognition results corresponding to the N depth cameras respectively, and send the N recognition results to the control module; and the control module combines the N recognition results, and determines a combination result as a recognition result of the target image (Doolittle Figs. 1-7, [0017]-[0020], [0030]: different depth cameras 14A-14C disposed at different positions to detect depth map of target object; [0026]-[0030]: First depth camera detects first portion of depth data of the object and second depth camera detects second portion of depth data of the object; Fig. 7, [0036]-[0037]: detecting first portion of gesture of object based on first portion of depth data from first depth camera, detecting second portion of gesture of object based on second portion of depth data from second depth camera, and combine the first portion and second portion of detected gesture to yield the combined data in the form of a completed gesture).
Doolittle does not explicitly disclose that the N processing modules respectively obtain N recognition results corresponding to the N depth cameras respectively.
However, Fang discloses N processing modules, wherein the N processing modules respectively obtain N recognition results corresponding to the N depth cameras respectively (Fang Figs. 1-2, [0034]: multi-camera environment includes camera array 101 having any number, N, of cameras 102 and 103, each camera generating corresponding video including video 140 and 150; [0094]: image processor 1802 and central processor 1101 may include any number and type of image or graphics processors, processing units or modules that provide control; [0034]: player detection modules 106 and 107 to process video from camera 1 and video from camera N, respectively, as in Fig. 1; [0043]: human pose estimation and body points such as ankle points can be determined).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to use the method and system, as disclosed by Doolittle, and further incorporate N processing modules, wherein the N processing modules respectively obtain N recognition results corresponding to the N depth cameras respectively, as taught by Fang, to detect objects with high accuracy and stability (Fang [0005]).
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to KATHLEEN V NGUYEN whose telephone number is (571)270-0626. The examiner can normally be reached on M-F 9:00am-6:00pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Jamie Atala can be reached on 571-272-7384. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/KATHLEEN V NGUYEN/Primary Examiner, Art Unit 2486