Prosecution Insights
Last updated: April 19, 2026
Application No. 18/043,593

SKELETON DETECTION SYSTEM
Final Rejection §103

Filed: Mar 01, 2023
Examiner: ABDI, AMARA
Art Unit: 2668
Tech Center: 2600 — Communications
Assignee: Hitachi Kokusai Electric Inc.
OA Round: 2 (Final)

Grant Probability: 83% (Favorable)
OA Rounds: 3-4
To Grant: 2y 7m
With Interview: 76%

Examiner Intelligence

Career Allow Rate: 83% (above average): 677 granted / 816 resolved, +21.0% vs TC avg
Interview Lift: -7.5% (minimal), comparing resolved cases with vs. without interview
Avg Prosecution: 2y 7m typical timeline; 33 currently pending
Career History: 849 total applications across all art units

Statute-Specific Performance

§101: 9.8% (-30.2% vs TC avg)
§103: 60.7% (+20.7% vs TC avg)
§102: 10.2% (-29.8% vs TC avg)
§112: 10.0% (-30.0% vs TC avg)

Tech Center averages are estimates. Based on career data from 816 resolved cases.
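As a consistency check on the figures above, the headline allow rate follows directly from the granted/resolved counts, and the Tech Center average can be back-derived from the stated +21.0% delta. A minimal sketch; the derived TC average is an inference from this page's numbers, not an independently reported statistic:

```python
# Figures as reported on this page.
granted, resolved = 677, 816

career_allow_rate = granted / resolved          # reported as 83%
tc_avg_estimate = career_allow_rate - 0.210     # implied by "+21.0% vs TC avg"

print(f"Career allow rate: {career_allow_rate:.1%}")
print(f"Implied TC average: {tc_avg_estimate:.1%}")
```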

Office Action

§103
Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Amendment

Applicant's response to the last Office action, filed July 10, 2025, has been entered and made of record. Claims 1 and 7-8 have been amended; claims 3, 5, 9, and 12-13 have been cancelled; and claims 16-21 have been newly added. By this amendment, claims 1-2, 4, 6-8, 10-11, and 14-21 are currently pending in this application.

Response to Arguments

Applicant's arguments with respect to claims 1-2, 4, 6-8, 10-11, and 14-21 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103, which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-2, 4, 6, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Miyake et al. (US-PGPUB 20220414898) in view of Nishimoto et al. (US-PGPUB 20110304540).

Regarding claim 1, Miyake discloses a skeleton detection system, comprising: one or more arithmetic devices (8 in Fig. 1); and one or more storage devices (Fig. 1, "storage unit"), wherein the one or more storage devices store a target image for use in skeleton detection, and wherein the target image is a frame image in a moving image (see at least Par. 0043: the storage unit 8B can also temporarily store image information of an image captured by the image capturing unit 2, where the image information corresponds to the target image implicitly for use in skeleton detection; further, from Par. 0027, the image capturing unit 2 is implicitly a video camera providing a video stream, and the target image shown in Fig. 3 is implicitly a frame in a moving image), and a plurality of skeleton detection models respectively corresponding to a plurality of skeleton definition models configured to define different skeletons (see at least Fig. 3 and Par. 0051-0052: the skeleton model generating unit 8b estimates and generates the skeleton model MDL representing the person P in the image I, where the generated models P1(P) and P2(P) shown in Fig. 3 represent the plurality of skeleton detection models, corresponding to a person P1 standing up and a person P2 sitting down on a seat S, respectively, and where the generated models P1(P) and P2(P) implicitly define different skeletons); and wherein the one or more arithmetic devices: determine a predetermined condition for the skeleton detection of the target image (see at least Par. 0055: the determination unit 8c of the present embodiment determines the state of the person P corresponding to the skeleton model MDL generated by the skeleton model generating unit 8b, by distinguishing between the person P standing up and the person P sitting down, i.e., determines a predetermined condition, "person P standing up and the person P sitting down"; the limitation "for the skeleton detection of the target image" is an intended use in the claim); select a first skeleton detection model from the plurality of skeleton detection models based on a result of the determination of the predetermined condition, based on a skeleton detection result in a past frame image in the moving image (see at least Par. 0052: the skeleton model generating unit 8b estimates and generates the skeleton model MDL representing the person P in the image I, from the image I captured by the image capturing unit 2, using various known methods such as a background subtraction method, a mean shift method, pattern matching, various machine learning methods, and the like; from Par. 0057: the fall prevention system 1 stores in advance the learned mathematical model for state determination and the like obtained by machine learning into the storage unit 8B, where the fall prevention system 1 performs machine learning using "whether the person P is standing up, is sitting down, or has fallen over" and the like as objective variables; and from Par. 0055: distinguishing between the person P standing up and the person P sitting down, "one of a condition related to a content of the target image", i.e., selecting the first skeleton detection model from the plurality of skeleton detection models, "estimates and generates the skeleton model MDL representing the person P in the image I", based on a result of the determination of the predetermined condition, implicitly based on distinguishing between the person P standing up and the person P sitting down); and execute the skeleton detection of the target image by the first skeleton detection model (see at least Par. 0058: the determination unit 8c determines the state of the person corresponding to the skeleton model MDL, by classification and regression based on the learned mathematical model for state determination or the like stored in the storage unit 8B as described above, implicitly executing the skeleton detection of the target image, "distinguishing between the person P standing up, the person P sitting down, and the person P fallen over", by the first skeleton detection model, "MDL").

Miyake does not expressly disclose selecting a first skeleton detection model from the plurality of skeleton detection models based on a distance between a person in the target image and an imaging device for the target image.

However, Nishimoto discloses selecting a first skeleton detection model from the plurality of skeleton detection models based on a distance between a person in the target image and an imaging device for the target image (see at least Fig. 2A and Par. 0162: the depth information, i.e., information about the distance from the object, may be acquired using a CMOS sensor, a distance sensor, or the like; and from Fig. 3 and Par. 0164-0165: skeleton information (motion information in a broad sense) used to specify the motion of the operator is acquired based on the depth information shown in Fig. 2B; the three-dimensional shape of the operator or the like viewed from the image sensor ISE can be acquired using the depth information shown in Fig. 2B; and a matching process is performed on the body shape/physique of the operator and the body shape/physique of the plurality of models, using the depth information and the color image information about the operator obtained using the image sensor ISE, to specify a model having a body shape/physique similar to that of the operator, i.e., selecting a first skeleton detection model from the plurality of skeleton detection models, implicitly by specifying a model having a body shape/physique similar to that of the operator through the matching process, based on a distance between a person in the target image and an imaging device for the target image, "based on the depth information, or information about the distance from the object").

Miyake and Nishimoto are combinable because they are both concerned with skeleton detection. Therefore, it would have been obvious to a person of ordinary skill in the art to modify Miyake to use a CMOS sensor, a distance sensor, or the like, as taught by Nishimoto, in order to specify the skeleton model having a body shape/physique similar to that of the operator based on the depth information (Nishimoto, Par. 0165).

Regarding claim 2, the combined teaching of Miyake and Nishimoto as a whole discloses the limitations of claim 1.
Miyake further discloses wherein the predetermined condition includes at least one of a condition related to a content of the target image, a condition related to an arithmetic resource for the skeleton detection, a condition related to an imaging device configured to capture the target image, a condition related to a content of an image prior to the target image, and a condition designated by a user (Miyake, see at least Par. 0055: distinguishing between the person P standing up and the person P sitting down, "one of a condition related to a content of the target image").

Regarding claim 4, the combined teaching of Miyake and Nishimoto as a whole discloses the limitations of claim 1. Miyake further discloses wherein the one or more arithmetic devices select the first skeleton detection model from the plurality of skeleton detection models based on a detection result of a keypoint in a skeleton detection result in the past frame image (Miyake, see at least Par. 0053: skeleton parts of the human body such as the head, eyes, nose, mouth, shoulders, hips, feet, knees, elbows, hands, joints, and the like of the person P are symbolically represented by "points", and the skeleton model MDL is generated by connecting the points with "lines"; and Par. 0058: the determination unit 8c determines the state of the person corresponding to the skeleton model MDL, by classification and regression based on the learned mathematical model for state determination or the like stored in the storage unit 8B as described above, i.e., selecting the first skeleton detection model from the plurality of skeleton detection models based on a detection result of a keypoint in a skeleton detection result in the past frame image, based on learned skeleton parts of the human body represented by points forming the learned mathematical model).

Regarding claim 6, the combined teaching of Miyake and Nishimoto as a whole discloses the limitations of claim 1.
Miyake further discloses wherein the one or more arithmetic devices select the first skeleton detection model from the plurality of skeleton detection models based on the number of persons in the target image (Miyake, see at least Fig. 3 and Par. 0052: generates the skeleton model MDL representing the person P in the image I, from the image I captured by the image capturing unit 2, using various known methods such as a background subtraction method, a mean shift method, and the like; and Par. 0053: if there are multiple persons P in the image I, the skeleton model generating unit 8b generates multiple skeleton models MDL according to the number of persons P, i.e., selecting the first skeleton detection model from the plurality of skeleton detection models, "generates the skeleton model MDL representing the person P, using a background subtraction method ... pattern matching", based on the number of persons in the target image, "according to the number of persons P").

Regarding claim 19, the combined teaching of Miyake and Nishimoto as a whole discloses the limitations of claim 1. Furthermore, Nishimoto discloses wherein the skeleton model is selected as the first skeleton detection model based on a distance between a person in the target image and an imaging device for the target image (see at least Fig. 2A, Par. 0162, and Fig. 3, Par. 0164-0165; see the rejection of claim 1 for more details).

However, the combined teaching of Miyake and Nishimoto as a whole does not expressly disclose wherein a long-distance skeleton model is selected as the first skeleton detection model when the distance is greater than one hundred (100) meters. At the time of the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to have the distance be greater than one hundred (100) meters.
Applicant has not disclosed that having the distance greater than one hundred (100) meters provides an advantage, is used for a particular purpose, or solves a stated problem. One of ordinary skill in the art, furthermore, would have expected Applicant's invention to perform equally well with either the distance between a person in the target image and an imaging device for the target image, as taught by Nishimoto, or the claimed distance between a person in the target image and an imaging device for the target image being greater than one hundred (100) meters, because both distances are used to perform the same function of specifying the skeleton model having a body shape/physique similar to that of the operator based on the depth information (Nishimoto, Par. 0165).

Claims 7, 10, and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Miyake et al. (US-PGPUB 20220414898) in view of Wu et al. (US-PGPUB 20150092978), and further in view of Nishimoto et al. (US-PGPUB 20110304540).

Regarding claim 7, claim 7 recites substantially similar limitations as set forth in claim 1. As such, claim 7 is rejected for at least a similar rationale. The Examiner further acknowledges the following additional limitation(s): "wherein the one or more arithmetic devices select the first skeleton detection model from the plurality of skeleton detection models based on a result of the determination, based on a purpose of performing skeleton detection, and based on a distance between a person in the target image and an imaging device for the target image."

However, Wu discloses selecting the first skeleton detection model from the plurality of skeleton detection models based on a result of the determination based on a purpose of performing skeleton detection (see at least Par. 0021: the online video process module 110 can be used to monitor, capture, and extract body skeleton joint data (Fig. 4) from video frames.
The offline analysis module 120 saves the captured skeleton frames into files and provides a database management interface for manually initializing and managing a behavior database, where the user interface implicitly manages a behavior database for the skeleton detection based on acquired skeleton data; and from Par. 0022 and 0028: the behavior recognition module 130 can be configured to determine whether a human behavior detected in the video stream 150 belongs to a type of behavior or behavior classification by using template matching and/or machine learning, i.e., selecting the first skeleton detection model from the plurality of skeleton detection models "by using template matching and/or machine learning", based on a purpose of performing skeleton detection designated by a user, "based on managing a behavior database for the skeleton detection, by the user interface").

Miyake and Wu are combinable because they are both concerned with skeleton model generation. Therefore, it would have been obvious to a person of ordinary skill in the art to modify Miyake to use the offline analysis module 120, as taught by Wu, in order to provide a database management interface for manually initializing and managing a behavior database comprising the skeleton data (Wu, Par. 0021), and to determine the presence of normal behavior and/or abnormal behavior by using a machine learning and/or a template matching process (Wu, Par. 0028).

The combined teaching of Miyake and Wu does not expressly disclose selecting the first skeleton detection model from the plurality of skeleton detection models based on a result of the determination, based on a distance between a person in the target image and an imaging device for the target image.

However, Nishimoto discloses selecting the first skeleton detection model from the plurality of skeleton detection models based on a result of the determination, based on a distance between a person in the target image and an imaging device for the target image (see at least Fig. 2A and Par. 0162: the depth information, i.e., information about the distance from the object, may be acquired using a CMOS sensor, a distance sensor, or the like; and from Fig. 3, Par. 0164-0165: skeleton information (motion information in a broad sense) used to specify the motion of the operator is acquired based on the depth information shown in Fig. 2B; the three-dimensional shape of the operator or the like viewed from the image sensor ISE can be acquired using the depth information shown in Fig. 2B; and a matching process is performed on the body shape/physique of the operator and the body shape/physique of the plurality of models, using the depth information and the color image information about the operator obtained using the image sensor ISE, to specify a model having a body shape/physique similar to that of the operator, i.e., selecting a first skeleton detection model from the plurality of skeleton detection models, implicitly by specifying a model having a body shape/physique similar to that of the operator through the matching process, based on a distance between a person in the target image and an imaging device for the target image, "based on the depth information, or information about the distance from the object").

Miyake, Wu, and Nishimoto are combinable because they are all concerned with skeleton detection.
Therefore, it would have been obvious to a person of ordinary skill in the art to modify the combined teaching of Miyake and Wu to use a CMOS sensor, a distance sensor, or the like, as taught by Nishimoto, in order to specify the skeleton model having a body shape/physique similar to that of the operator based on the depth information (Nishimoto, Par. 0165).

Regarding claim 10, claim 10 recites substantially similar limitations as set forth in claim 2. As such, claim 10 is rejected for at least a similar rationale.

Regarding claim 14, claim 14 recites substantially similar limitations as set forth in claim 6. As such, claim 14 is rejected for at least a similar rationale.

Claims 8, 11, and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Miyake et al. (US-PGPUB 20220414898) in view of Lang et al. (CN 111444921), and further in view of Nishimoto et al. (US-PGPUB 20110304540).

Regarding claim 8, claim 8 recites substantially similar limitations as set forth in claim 1. As such, claim 8 is rejected for at least a similar rationale. The Examiner further acknowledges the following additional limitation(s): "wherein the one or more arithmetic devices select the first skeleton detection model from the plurality of skeleton detection models, to ensure an estimated calculation cost of the skeleton detection falls within a usable calculation resource, and based on a distance between a person in the target image and an imaging device for the target image."

However, Lang discloses selecting the first skeleton detection model from the plurality of skeleton detection models such that an estimated calculation cost of the skeleton detection falls within a usable calculation resource (see at least Page 2, lines 17-18: performing skeleton extraction on each connected region and obtaining at least one framework as the detected scratch, i.e., selecting the first skeleton detection model from the plurality of skeleton detection models.
Further, from Page 16, 6th paragraph: by combining the skeleton extraction algorithm and the Hough transform linear detection algorithm, through the result of the division for skeleton extraction, the number of foreground points input to the Hough transform is reduced, which reduces the processing time of the Hough conversion and greatly improves calculation efficiency; Hough transform linear detection is then performed on the extracted skeleton, and the connected region of the straight line detected by the Hough transform is analyzed, i.e., selecting the first skeleton detection model from the plurality of skeleton detection models such that an estimated calculation cost of the skeleton detection falls within a usable calculation resource).

Miyake and Lang are combinable because they are both concerned with skeleton model generation. Therefore, it would have been obvious to a person of ordinary skill in the art to modify Miyake to combine the skeleton extraction algorithm and the Hough transform linear detection algorithm, as taught by Lang, in order to reduce the processing time of the Hough conversion and greatly improve calculation efficiency (Lang, Page 16, 6th paragraph).

However, the combined teaching of Miyake and Lang as a whole does not expressly disclose selecting the first skeleton detection model from the plurality of skeleton detection models based on a distance between a person in the target image and an imaging device for the target image. Nishimoto discloses this limitation, as set forth in the rejection of claim 1 above. Miyake, Lang, and Nishimoto are combinable because they are all concerned with skeleton detection. Therefore, it would have been obvious to a person of ordinary skill in the art to modify the combined teaching of Miyake and Lang to use a CMOS sensor, a distance sensor, or the like, as taught by Nishimoto, in order to specify the skeleton model having a body shape/physique similar to that of the operator based on the depth information (Nishimoto, Par. 0165).
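The claim-8 limitation addressed above (selecting a model so that the estimated calculation cost of the skeleton detection falls within a usable calculation resource) is, in implementation terms, budget-constrained model selection. A minimal illustrative sketch; the model names, cost figures, and accuracy scores below are hypothetical, not drawn from any cited reference:

```python
from dataclasses import dataclass

@dataclass
class SkeletonModel:
    name: str
    est_cost_gflops: float  # estimated per-frame compute cost (hypothetical)
    accuracy: float         # offline benchmark score (hypothetical)

def select_model(models, budget_gflops):
    """Return the most accurate model whose estimated cost fits the budget."""
    affordable = [m for m in models if m.est_cost_gflops <= budget_gflops]
    if not affordable:
        raise RuntimeError("no skeleton model fits the usable calculation resource")
    return max(affordable, key=lambda m: m.accuracy)

models = [
    SkeletonModel("lightweight", 2.0, 0.71),
    SkeletonModel("standard", 9.0, 0.80),
    SkeletonModel("heavy", 35.0, 0.86),
]
print(select_model(models, budget_gflops=10.0).name)  # prints "standard"
```

Under this reading, Lang's efficiency argument (reducing the Hough-transform input to cut processing time) corresponds to keeping the per-frame cost within the available budget.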
Regarding claim 11, claim 11 recites substantially similar limitations as set forth in claim 2. As such, claim 11 is rejected for at least a similar rationale.

Regarding claim 15, claim 15 recites substantially similar limitations as set forth in claim 6. As such, claim 15 is rejected for at least a similar rationale.

Claims 16-17 are rejected under 35 U.S.C. 103 as being unpatentable over Miyake and Nishimoto, as applied to claim 1 above, and further in view of Chu et al. (US-PGPUB 20200065991).

Regarding claim 16, the combined teaching of Miyake and Nishimoto as a whole discloses the limitations of claim 1. Nishimoto further discloses wherein the distance between the imaging device and the person is estimated based on a size of the person in the target image (see at least Par. 0145: the depth information is acquired by emitting light (e.g., infrared radiation) from the image sensor ISE (depth sensor) and detecting the reflection intensity or the time of flight of the emitted light to detect the shape of the object (e.g., operator) viewed from the position of the image sensor ISE, i.e., the distance between the imaging device and the person is estimated, "estimating the depth", based on a size of the person in the target image, "the shape of the object or operator").

The combined teaching of Miyake and Nishimoto as a whole does not expressly disclose wherein the distance between the imaging device and the person is estimated based on an in-image coordinate. However, Chu discloses wherein the distance between the imaging device and the person is estimated based on an in-image coordinate (see at least Par. 0040: defining the coordinate point of the ankle joint of the virtual foot model F.sub.V in the space coordinate system, where the defined coordinate point corresponds to the ankle joint point P.sub.A of the user.
The coordinates of the reference point P.sub.1 can be provided for subsequent calculations for obtaining the whole depth information of the ankle joint, i.e., the distance between the imaging device and the person, "depth information", is estimated based on an in-image coordinate, "the ankle joint point").

Miyake, Nishimoto, and Chu are combinable because they are all concerned with skeleton detection. Therefore, it would have been obvious to a person of ordinary skill in the art to modify the combined teaching of Miyake and Nishimoto to define the coordinate point of the ankle joint, as taught by Chu, in order to obtain the whole depth information of the ankle joint based on the coordinate point of the ankle joint (Chu, Par. 0040).

Regarding claim 17, the combined teaching of Miyake and Nishimoto as a whole discloses the limitations of claim 1. The combined teaching of Miyake and Nishimoto as a whole does not expressly disclose wherein the distance between the imaging device and the person is estimated based on an in-image coordinate of an ankle of the person in the target image. However, Chu discloses wherein the distance between the imaging device and the person is estimated based on an in-image coordinate of an ankle of the person in the target image (see at least Figs. 1A-B, 2, 3A, and Par. 0040: defining the coordinate point of the ankle joint of the virtual foot model F.sub.V in the space coordinate system, where the defined coordinate point corresponds to the ankle joint point P.sub.A of the user; the coordinates of the reference point P.sub.1 can be provided for subsequent calculations for obtaining the whole depth information of the ankle joint, i.e., the distance between the imaging device and the person, "depth information", is estimated based on an in-image coordinate of an ankle of the person in the target image, "the ankle joint point coordinate").
Miyake, Nishimoto, and Chu are combinable because they are all concerned with skeleton detection. Therefore, it would have been obvious to a person of ordinary skill in the art to modify the combined teaching of Miyake and Nishimoto to define the coordinate point of the ankle joint, as taught by Chu, in order to obtain the whole depth information of the ankle joint based on the coordinate point of the ankle joint (Chu, Par. 0040).

Claim 18 is rejected under 35 U.S.C. 103 as being unpatentable over Miyake and Nishimoto, as applied to claim 1 above, and further in view of Lee et al. (US-PGPUB 20210073953).

Regarding claim 18, the combined teaching of Miyake and Nishimoto as a whole discloses the limitations of claim 1, but does not expressly disclose wherein the person comprises a farthest person in the target image. Lee discloses wherein the person comprises a farthest person in the target image (see at least Fig. 7, Par. 0108: the person 720 is the farthest person in an image). Miyake, Nishimoto, and Lee are combinable because they are all concerned with skeleton detection. Therefore, it would have been obvious to a person of ordinary skill in the art to modify the combined teaching of Miyake and Nishimoto to select the farthest person 720 in an image, as taught by Lee, in order to generate a depth map indicating depth information of pixels in the image (Lee, Par. 0008).

Claim 21 is rejected under 35 U.S.C. 103 as being unpatentable over Miyake, Wu, and Nishimoto, as applied to claim 7 above, and further in view of Davydov et al. (US-PGPUB 20180315200).

Regarding claim 21, the combined teaching of Miyake, Wu, and Nishimoto as a whole discloses the limitations of claim 7, but does not expressly disclose wherein the purpose comprises a number-of-persons count, an intruder detection, or a specific activity detection.
However, Davydov discloses wherein the purpose comprises a number-of-persons count, an intruder detection, or a specific activity detection (see at least Par. 0070: generating a human skeleton and detecting the pose and activity of a human, i.e., specific activity detection). Miyake, Wu, Nishimoto, and Davydov are combinable because they are all concerned with skeleton detection. Therefore, it would have been obvious to a person of ordinary skill in the art to modify the combined teaching of Miyake, Wu, and Nishimoto to identify the anatomical landmarks of human bodies within the frame, as taught by Davydov, in order to detect an activity of a human (Davydov, Par. 0070).

Allowable Subject Matter

Claim 20 is objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

With respect to claim 20, the prior art of record, alone or in reasonable combination, does not teach or suggest the following limitation(s), in consideration of the claim as a whole: "wherein a branch determination based on a detection rate of a key point in a set of two or more preceding images to the target image when the distance is less than or equal to one hundred (100) meters".

The relevant prior art of record, Miyake et al. (US-PGPUB 20220414898), discloses a skeleton detection system, comprising: one or more arithmetic devices (8 in Fig. 1); and one or more storage devices (Fig. 1, "storage unit"), wherein the one or more storage devices store a target image for use in skeleton detection, and wherein the target image is a frame image in a moving image (see at least Par. 0043 and Par. 0027, as detailed in the rejection of claim 1 above), and a plurality of skeleton detection models respectively corresponding to a plurality of skeleton definition models configured to define different skeletons (see at least Fig. 3 and Par. 0051-0052); and wherein the one or more arithmetic devices: determine a predetermined condition for the skeleton detection of the target image (see at least Par. 0055); select a first skeleton detection model from the plurality of skeleton detection models based on a result of the determination of the predetermined condition, based on a skeleton detection result in a past frame image in the moving image (see at least Par. 0052, 0057, and 0055); and execute the skeleton detection of the target image by the first skeleton detection model (see at least Par. 0058).
However, while disclosing selecting a first skeleton detection model, Miyake et al. fails to teach or suggest, either alone or in combination with the other cited references, a branch determination based on a detection rate of a key point in a set of two or more preceding images to the target image when the distance is less than or equal to one hundred (100) meters. A further prior art of record, Nishimoto et al. (US-PGPUB 20110304540), discloses selecting a first skeleton detection model from the plurality of skeleton detection models based on a distance between a person in the target image and an imaging device for the target image, (see at least: Fig. 2A, and Par. 0162, the depth information (i.e., information about the distance from the object) may be acquired using a CMOS sensor, a distance sensor, or the like; and from Fig. 3, Par. 0164-0165, skeleton information (motion information in a broad sense) used to specify the motion of the operator is acquired based on the depth information shown in FIG. 2B. The three-dimensional shape of the operator or the like viewed from the image sensor ISE can be acquired using the depth information shown in FIG. 
2B; and a matching process is performed on the body shape/physique of the operator and the body shape/physique of the plurality of models, using the depth information and the color image information about the operator obtained using the image sensor ISE, to specify a model having a body shape/physique similar to that of the operator, [i.e., selecting a first skeleton detection model from the plurality of skeleton detection models, “implicit by specifying a model having a body shape/physique similar to that of the operator by performing a matching process”, based on a distance between a person in the target image and an imaging device for the target image, “based on the depth information, or information about the distance from the object”]). However, while disclosing selecting a first skeleton detection model from the plurality of skeleton detection models based on a distance between a person in the target image and an imaging device for the target image, Nishimoto et al. fails to teach or suggest, either alone or in combination with the other cited references, a branch determination based on a detection rate of a key point in a set of two or more preceding images to the target image when the distance is less than or equal to one hundred (100) meters. Beck et al. (US-PGPUB 20190354777) discloses a skeleton model 504 having a multiple branching structure that comprises a plurality of nodes at various points corresponding to various joints/parts of a skeleton, (Fig. 5, Par. 0077); but fails to teach or suggest, either alone or in combination with the other cited references, the above limitations (as combined with the other claimed limitations). Conclusion Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a). 
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. Contact Information Any inquiry concerning this communication or earlier communications from the examiner should be directed to AMARA ABDI whose telephone number is (571)272-0273. The examiner can normally be reached 9:00am-5:30pm. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Vu Le can be reached at (571) 272-7332. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300. Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. 
For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. /AMARA ABDI/Primary Examiner, Art Unit 2668 10/07/2025
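For readers mapping the rejection to the claim language: the limitation the examiner finds missing from every cited reference is a branch determination driven by the detection rate of a key point across two or more preceding frames, applied only when the subject is within 100 meters of the imaging device. A minimal sketch of that logic follows. All identifiers (select_model, key_point_detection_rate, the model names) and the 0.5 rate threshold are hypothetical illustrations; only the 100-meter figure comes from the claim language, and nothing here is taken from the application's actual implementation.

```python
# Hypothetical sketch of the claimed branch determination: choose a skeleton
# detection model for the target image using the key-point detection rate over
# two or more preceding frames, but only when the person is within 100 meters.
# Every name and the 0.5 threshold are illustrative assumptions.

DISTANCE_THRESHOLD_M = 100.0   # from the claim language: <= 100 meters
RATE_THRESHOLD = 0.5           # hypothetical branch threshold

def key_point_detection_rate(preceding_frames, key_point):
    """Fraction of the preceding frames (two or more, per the claim) in
    which the given key point was successfully detected."""
    assert len(preceding_frames) >= 2, "claim requires two or more preceding images"
    hits = sum(1 for frame in preceding_frames
               if key_point in frame["detected_key_points"])
    return hits / len(preceding_frames)

def select_model(distance_m, preceding_frames, key_point="right_wrist"):
    """Branch determination: pick a detection model for the target image."""
    if distance_m <= DISTANCE_THRESHOLD_M:
        rate = key_point_detection_rate(preceding_frames, key_point)
        # Low detection rate in recent frames -> fall back to a coarser model.
        return "fine_grained_model" if rate >= RATE_THRESHOLD else "coarse_model"
    # Beyond 100 m the claimed branch determination does not apply.
    return "default_model"

frames = [{"detected_key_points": {"right_wrist", "head"}},
          {"detected_key_points": {"head"}}]
print(select_model(42.0, frames))   # -> fine_grained_model
```

Under this reading, Miyake's state-based selection and Nishimoto's distance-based selection each supply one input to the branch, but neither reference conditions the selection on a multi-frame key-point detection rate gated by the distance threshold.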

Prosecution Timeline

Mar 01, 2023
Application Filed
May 03, 2025
Non-Final Rejection — §103
Jul 10, 2025
Response Filed
Oct 07, 2025
Final Rejection — §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12602822
METHOD DEVICE AND STORAGE MEDIUM FOR BACK-END OPTIMIZATION OF SIMULTANEOUS LOCALIZATION AND MAPPING
2y 5m to grant · Granted Apr 14, 2026
Patent 12597252
METHOD OF TRACKING OBJECTS
2y 5m to grant · Granted Apr 07, 2026
Patent 12576595
SYSTEMS AND METHODS FOR IMPROVED VOLUMETRIC ADDITIVE MANUFACTURING
2y 5m to grant · Granted Mar 17, 2026
Patent 12574469
VIDEO SURVEILLANCE SYSTEM, VIDEO PROCESSING APPARATUS, VIDEO PROCESSING METHOD, AND VIDEO PROCESSING PROGRAM
2y 5m to grant · Granted Mar 10, 2026
Patent 12563154
VIDEO SURVEILLANCE SYSTEM, VIDEO PROCESSING APPARATUS, VIDEO PROCESSING METHOD, AND VIDEO PROCESSING PROGRAM
2y 5m to grant · Granted Feb 24, 2026
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

3-4
Expected OA Rounds
83%
Grant Probability
76%
With Interview (-7.5%)
2y 7m
Median Time to Grant
Moderate
PTA Risk
Based on 816 resolved cases by this examiner. Grant probability derived from career allow rate.
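The 83% figure is simply the examiner's career allow rate restated. Assuming the counts shown in the examiner stats (677 granted of 816 resolved), the derivation is a one-line ratio:

```python
# Sanity check: grant probability as the career allow rate,
# using the 677 granted / 816 resolved figures from the stats above.
granted, resolved = 677, 816
allow_rate = granted / resolved
print(f"{allow_rate:.0%}")  # -> 83%
```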
