DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Priority
Acknowledgment is made of Applicant's claim of priority from JP2022-034769, filed March 7, 2022.
Information Disclosure Statement
The information disclosure statement (“IDS”) filed on November 20, 2025 was reviewed and the listed references were noted.
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on November 20, 2025 has been entered.
Status of Claims
Claims 1-3 and 5-21 are pending. Claim 4 has been cancelled. Claims 19-21 are newly added.
Response to Arguments
Applicant's arguments filed November 20, 2025 have been fully considered but are moot because of the new grounds of rejection presented in the sections below. Applicant argues that none of the previously applied references teach the newly added limitations. However, the Pan reference, presented in the rejections of claims 11 and 15, teaches determining a joint point, calculating a score (i.e., reliability) of the joint point belonging to a person, and assigning the joint point to the person for whom the reliability is higher. Examiner asserts that this reference is sufficient to teach the newly added limitation of “wherein (a) a joint point of the reference person is estimated, (b) a reliability of the joint point is calculated, and (c) a region including the joint point which has the reliability greater than a predetermined threshold is determined in the determining as the processing region”.
Applicant further argues that there is no motivation to arrive at the combination of references presented herein. Examiner respectfully disagrees. As presented in the 35 USC 103 rejections below, the motivations to combine the Divakaran, Okami, Herley and Pan references are expressly stated and taken from the references themselves. For example, as stated below, one having ordinary skill in the art would have been motivated to combine the Pan reference with the Divakaran, Okami and Herley references because doing so would allow for improving estimation accuracy when estimating a person from an image, as recognized by Pan. Therefore, the 35 USC 103 rejection of the claims is maintained.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1-2, 9, 11, 13-15 and 18-21 are rejected under 35 U.S.C. 103 as being unpatentable over Divakaran et al. (US 2014/0347475 A1) in view of Okami et al. (US 2021/0174526 A1) further in view of Cormac Herley (US 2004/0264806 A1) and Yadong Pan (US 2024/0303855 A1, with priority to PCT/JP2021/001248, filed January 15, 2021, PGPub used herein as a translation and for mapping purposes).
Regarding claim 1, Divakaran teaches an image processing device comprising:
at least one processor (Divakaran, Para. [0092], the illustrative computing device includes at least one processor); and
at least one memory having stored therein instructions which, when executed by the at least one processor (Divakaran, Para. [0092], the processor and the I/O subsystem are communicatively coupled to the memory. Para. [0094], the storage media may include one or more hard drives or other suitable data storage devices (e.g., flash memory, memory cards, memory sticks, and/or others). In some embodiments, portions of the OT node subsystem, the video stream, the detection stream, the track stream, the occlusion maps, the calibration db 162, and/or other data reside at least temporarily in the storage media), cause the image processing device at least to:
(1) detect a plurality of persons in an image (Divakaran, Para. [0038], the human detection module relays the geo-positions of any detected persons as well as any regions-of-interest (ROIs) of the detected person(s) in each of the individual image/frames to the real-time tracking module as a detection stream),
(2) determine whether a reference person is concealed by another person in the image (Divakaran, Para. [0048], when the occlusion results from the presence of another person (as may be the case when people are walking together in a group), the body parts of multiple people may appear to overlap and it may be difficult to tell which body parts belong to which of the persons in the group).
Although Divakaran teaches occlusions resulting from overlapping persons (Divakaran, Para. [0048]) and teaches an occlusion reasoning engine to explicitly reason about the occlusion of various body parts (Divakaran, Para. [0049]), Divakaran does not explicitly teach “(3) determine as a processing region, in accordance with a state of the concealment, at least a part or all of a region of the another person in the image” and “(4) estimate, in a processed image obtained by performing a process on the processing region in the image, a joint point of the reference person”. However, in an analogous field of endeavor, Okami teaches that an occlusion region refers to a part or the whole region of a subject in the background hidden by a subject in the foreground overlapping with that subject (i.e., at least a part or all of a region of the another person in the image) (Okami, Para. [0032]). Okami further teaches that an estimation device estimates information on a joint in an occlusion region of a subject in the background from an image containing a plurality of subjects overlapping with each other in the depth direction (Okami, Para. [0031]). Okami teaches that the process performed on the processing region to generate the joint information of the person involves generating time-series information based on the difference between a target image and each of the images other than the target image in the obtained image group, receiving the target image and time-series information and generating depth information in the target image and silhouette information on a person captured in the target image, and receiving the silhouette information and generating joint information (Okami, Para. [0048]-[0050]).
Therefore, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the device of Divakaran with the teachings of Okami by including determining a processing region (i.e., occlusion region) of part or all of a subject in a background hidden by a subject in the foreground and performing processing on the processing region to estimate joint information of the person. One having ordinary skill in the art would have been motivated to combine these references because doing so would allow for estimating joint information for a subject even if there is a plurality of persons captured in an image, as recognized by Okami.
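For illustration only, the three-stage Okami pipeline as summarized above might be sketched as follows; the model stubs (depth_silhouette_net, joint_net) and all identifiers are hypothetical placeholders, not code from any cited reference:

```python
# High-level sketch (hypothetical stubs) of the pipeline summarized from
# Okami Paras. [0048]-[0050]: (1) time-series information from frame
# differences, (2) depth and silhouette from the target frame plus that
# information, (3) joint information from the silhouette.
import numpy as np

def estimate_joints(frames, target_idx, depth_silhouette_net, joint_net):
    target = frames[target_idx]
    # (1) difference between the target image and each other image in the group
    time_series = np.stack([target.astype(float) - f.astype(float)
                            for i, f in enumerate(frames) if i != target_idx])
    # (2) depth information and person silhouette from target + time series
    depth, silhouette = depth_silhouette_net(target, time_series)
    # (3) joint information generated from the silhouette
    return joint_net(silhouette)
```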
Although Divakaran in view of Okami teaches performing processing on an occlusion region to determine joint information (Okami, Para. [0048]-[0050]), they do not explicitly teach “wherein the process performed on the processing region is at least one of deformation, softening, mosaic processing, and blacking out of pixels”. However, in an analogous field of endeavor, Herley teaches generating a mosaic or composite image having reduced or eliminated areas of occlusion relative to any of the input images by forming a composite image from non-occluded regions of the input images (Herley, Para. [0057]).
Therefore, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the device of Divakaran in view of Okami with the teachings of Herley by including performing a mosaic processing on the processing region by generating a mosaic or composite image from non-occluded regions of the input images. One having ordinary skill in the art would have been motivated to combine these references, because doing so would allow for reducing or eliminating areas of occlusion in an image, as recognized by Herley. Thus, the claimed invention would have been obvious to one having ordinary skill in the art before the effective filing date.
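As a rough illustration of the compositing idea characterized from Herley, the following sketch takes each output pixel from the first input frame in which that pixel is not occluded; the array shapes and mask semantics are assumptions for illustration, not Herley's implementation:

```python
# Compositing from non-occluded regions: each output pixel is taken from the
# first input frame that is not flagged as occluded at that location.
import numpy as np

def composite(frames: np.ndarray, occluded: np.ndarray) -> np.ndarray:
    """frames: (N, H, W, 3) images; occluded: (N, H, W) boolean occlusion masks."""
    out = frames[0].copy()
    filled = ~occluded[0]                    # pixels already supplied by frame 0
    for img, occ in zip(frames[1:], occluded[1:]):
        take = ~occ & ~filled                # missing pixels this frame can supply
        out[take] = img[take]
        filled |= take
    return out
```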
Although Divakaran in view of Okami further in view of Herley teaches performing processing on an occlusion region to determine joint information (Okami, Para. [0048]-[0050]), they do not explicitly teach “wherein (a) a joint point of the reference person is estimated, (b) a reliability of the joint point is calculated, and (c) a region including the joint point which has the reliability greater than a predetermined threshold is determined in the determining as the processing region”. However, in an analogous field of endeavor, Pan teaches detecting a joint point of a person in an image (i.e., a joint point of the reference person is estimated) (Pan, Para. [0065]). The attribution determination unit calculates a score indicating the possibility that each joint point belongs to the person in the image (i.e., a reliability of the joint point is calculated) and determines the person in the image to which the joint point belongs by using the calculated score (Pan, Para. [0067]). The score for the person 41 is smaller than the score for the person 42, and therefore the attribution determination unit determines the person to which the joint point belongs as the person 41 (i.e., region including the joint point has higher reliability (lower score) for person 41 than person 42. The reliability for person 41 is greater than the “predetermined threshold” that is the reliability of person 42) (Pan, Para. [0085]). The posture estimation unit then estimates the posture of the person based on the detection result by the joint point detection unit (i.e., region including the joint point with greater reliability is determined as the processing region) (Pan, Para. [0088]).
Therefore, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the device of Divakaran in view of Okami further in view of Herley with the teachings of Pan by including determining a score indicating the possibility that each joint point belongs to the person in the image and determining the person in the image to which the joint point belongs by using the calculated score, and then performing processing on part of the person based on the detection result. One having ordinary skill in the art would have been motivated to combine these references, because doing so would allow for improving estimation accuracy when estimating a person from an image, as recognized by Pan. Thus, the claimed invention would have been obvious to one having ordinary skill in the art before the effective filing date.
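For clarity, the claimed (a)-(c) logic as mapped to Pan can be sketched as follows; all names, the confidence scale, and the bounding-box heuristic are illustrative assumptions, not drawn from the cited references:

```python
# Estimate joint points, score their reliability, and determine as the
# processing region a box covering the joints whose reliability exceeds a
# predetermined threshold.
from dataclasses import dataclass

@dataclass
class JointPoint:
    x: float
    y: float
    reliability: float  # e.g., keypoint confidence in [0, 1] (assumed scale)

def processing_region(joints: list[JointPoint], threshold: float = 0.5):
    kept = [j for j in joints if j.reliability > threshold]  # step (c) filter
    if not kept:
        return None  # no sufficiently reliable joint; no processing region
    xs, ys = [j.x for j in kept], [j.y for j in kept]
    return (min(xs), min(ys), max(xs), max(ys))  # region including the joints
```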
Regarding claim 2, Divakaran in view of Okami further in view of Herley and Pan teaches the image processing device according to claim 1, wherein the instructions, when executed by the at least one processor, further cause the image processing device to generate concealment information obtained by determining (1) whether a region of the reference person is concealed by the region of the another person (Okami, Para. [0032], an occlusion region refers to a part or the whole region of a subject in the background hidden by a subject in the foreground overlapping with that subject) and (2) whether the region of the another person is located below the region of the reference person in the image, based on the region of the reference person and the region of the another person in the image (Divakaran, Para. [0052], when a person’s legs are temporarily occluded in an image of the video stream (i.e., the region of the another person is below the region of the reference person), the person can still be detected and tracked from the upper-body).
The proposed combination as well as the motivation for combining the Divakaran, Okami, Herley and Pan references presented in the rejection of Claim 1, apply to Claim 2 and are incorporated herein by reference. Thus, the device recited in Claim 2 is met by Divakaran in view of Okami further in view of Herley and Pan.
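A minimal sketch of the claim-2 determination, assuming axis-aligned boxes (x0, y0, x1, y1) with y increasing downward in the image (an assumed convention, not taken from the references):

```python
# Concealment information: record (1) whether the other person's region
# overlaps (conceals part of) the reference person's region and (2) whether
# it sits below it in the image.
def concealment_info(ref_box, other_box):
    overlaps = not (other_box[2] <= ref_box[0] or ref_box[2] <= other_box[0] or
                    other_box[3] <= ref_box[1] or ref_box[3] <= other_box[1])
    below = other_box[1] > ref_box[1]  # other region starts lower in the frame
    return {"concealed": overlaps, "other_below": below}
```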
Regarding claim 9, Divakaran in view of Okami further in view of Herley and Pan teaches the image processing device according to claim 1, and further teaches wherein
a joint point of the reference person and a joint point of the another person in the image are estimated (Divakaran, Para. [0039], the tracking module analyzes temporal sequences of frames in the video stream in order to track the movement of detected persons or objects over time. To do this, the tracking module initiates and maintains a local track stream for each of the individuals that are detected by the human detection module), and
wherein in a case where the joint point of the reference person is concealed by the joint point of the another person, a part or all of joint points of the another person in the image is determined as the processing region according to a state of the concealment (Divakaran, Para. [0040], the scene awareness module executes a projective transformation to map the static and dynamic masks from the image to the ground plane to create static and dynamic occlusion zones in geo-space. The static and/or dynamic occlusion zones may include personal occlusion zones, as described below. The occlusion zones detected and mapped by the scene awareness module are used in occlusion reasoning for both detection and tracking. Para. [0051], an occlusion may occur when both the tracked object and the occluder are moving (such as when many people are walking together in a group)).
Regarding claim 11, Divakaran in view of Okami further in view of Herley and Pan teaches the image processing device according to claim 9, as described, and further teaches wherein, in a case where detection reliability of each of the region of the joint point of the reference person and the region of the joint point of the another person on the image is less than a threshold value, a part of the region of the joint point of the another person is determined as the processing region (Pan, Para. [0067], calculating a score indicating the possibility that each joint point belongs to the person in the image and determining the person in the image to which the joint point belongs by using the calculated score. Para. [0085], the score for the person 41 is smaller than the score for the person 42, and therefore the attribution determination unit determines the person to which the joint point belongs as the person 41. Para. [0088], the posture estimation unit then estimates the posture of the person based on the detection result by the joint point detection unit).
The proposed combination as well as the motivation for combining the Divakaran, Okami, Herley and Pan references presented in the rejection of Claim 1, apply to Claim 11 and are incorporated herein by reference. Thus, the device recited in Claim 11 is met by Divakaran in view of Okami further in view of Herley and Pan.
Claim 13 recites a method with steps corresponding to the elements of the device recited in Claim 1. Therefore, the recited steps of this claim are mapped to the proposed combination in the same manner as the corresponding elements in the corresponding device claim. Additionally, the rationale and motivation to combine the Divakaran, Okami, Herley and Pan references, presented in the rejection of Claim 1, apply to this claim.
Claim 14 recites a computer-readable storage medium storing a program with instructions corresponding to the steps recited in Claim 13. Therefore, the recited programming instructions of this claim are mapped to the proposed combination in the same manner as the corresponding steps in the corresponding method claim. Additionally, the rationale and motivation to combine the Divakaran, Okami, Herley and Pan references, presented in the rejection of Claim 1, apply to this claim. Finally, the Divakaran, Okami, Herley and Pan references disclose a computer-readable storage medium (Divakaran, Para. [0109], instructions stored using one or more machine-readable media).
Regarding claim 15, Divakaran in view of Okami further in view of Herley and Pan teaches the image processing device according to claim 1, and further teaches wherein the joint point is any of an eye, a nose, an ear, a shoulder, an elbow, a wrist, a waist, a knee, or an ankle (Pan, Para. [0072], examples of the joint points to be detected include the right shoulder, right elbow, right wrist, right hip joint, right knee, right ankle, left shoulder, left elbow, left wrist, left hip joint, left knee, and left ankle).
The proposed combination as well as the motivation for combining the Divakaran, Okami, Herley and Pan references presented in the rejection of Claim 1, apply to Claim 15 and are incorporated herein by reference. Thus, the device recited in Claim 15 is met by Divakaran in view of Okami further in view of Herley and Pan.
Regarding claim 18, Divakaran in view of Okami further in view of Herley and Pan teaches the image processing device according to claim 1, and further teaches wherein the process performed on the processing region includes mosaic processing (Herley, Para. [0057], generating a mosaic or composite image having reduced or eliminated areas of occlusion relative to any of the input images by forming a composite image from non-occluded regions of the input images).
The proposed combination as well as the motivation for combining the Divakaran, Okami, Herley and Pan references presented in the rejection of Claim 1, apply to Claim 18 and are incorporated herein by reference. Thus, the device recited in Claim 18 is met by Divakaran in view of Okami further in view of Herley and Pan.
Regarding claim 19, Divakaran in view of Okami further in view of Herley and Pan teaches the image processing device according to claim 1, wherein after estimating a joint point of the another person, the region including the joint point which has the reliability greater than a predetermined threshold is determined as the processing region (Pan, Para. [0085], the score for the person 41 is smaller than the score for the person 42, and therefore the attribution determination unit determines the person to which the joint point belongs as the person 41 (i.e., region including the joint point has higher reliability (lower score) for person 41 than person 42. The reliability for person 41 is greater than the “predetermined threshold” that is the reliability of person 42). Para. [0088], the posture estimation unit then estimates the posture of the person based on the detection result by the joint point detection unit (i.e., region including the joint point with greater reliability is determined as the processing region)).
The proposed combination as well as the motivation for combining the Divakaran, Okami, Herley and Pan references presented in the rejection of Claim 1, apply to Claim 19 and are incorporated herein by reference. Thus, the device recited in Claim 19 is met by Divakaran in view of Okami further in view of Herley and Pan.
Regarding claim 20, Divakaran in view of Okami further in view of Herley and Pan teaches the image processing device according to claim 19, wherein after estimating a joint point of the reference person and the another person, (1) candidates of the processing region are listed based on the joint point (Pan, Para. [0087], Specifically, for example, as shown in FIG. 10, it is assumed that two of the joint points P1 and P2 belong to the person 42. In this case, the person 42 includes two left wrists, which is unnatural (i.e., joint points P1 and P2 are the candidates)), and (2) the processing region is determined based on a list of the candidates (Pan, Para. [0087], Therefore, the attribution correction unit 36 acquires the score calculated for the joint point P1 and the score calculated for the joint point P2 from the attribution determination unit 33, and compares the two scores. Then, the attribution correction unit 36 determines that the joint point having the larger score, that is, the joint point P1 in this case, does not belong to the person 42. As a result, the attribution of the joint points of the person is corrected).
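The attribution-correction step mapped from Pan Para. [0087] can be sketched as follows, adopting the "smaller score is better" convention from the mapping above; the (joint_type, score) data layout is an assumption for illustration:

```python
# When two candidate joints of the same type (e.g., two "left wrists") are
# attributed to one person, keep only the candidate with the smaller (better)
# score; the larger-score candidate is determined not to belong.
def correct_attribution(candidates):
    best = {}
    for joint_type, score in candidates:
        if joint_type not in best or score < best[joint_type]:
            best[joint_type] = score
    return list(best.items())

# e.g. correct_attribution([("left_wrist", 0.9), ("left_wrist", 0.3)])
# -> [("left_wrist", 0.3)]   # the larger-score duplicate is dropped
```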
The proposed combination as well as the motivation for combining the Divakaran, Okami, Herley and Pan references presented in the rejection of Claim 1, apply to Claim 20 and are incorporated herein by reference. Thus, the device recited in Claim 20 is met by Divakaran in view of Okami further in view of Herley and Pan.
Regarding claim 21, Divakaran in view of Okami further in view of Herley and Pan teaches the image processing device according to claim 20, wherein the list of the candidates of the processing region of the reference person is compared with the list of the candidates of the processing region of the another person, and wherein in a case where the candidates of the processing region of the reference person overlaps the candidates of the processing region of the another person, the candidates of the processing region is left in the list as the candidate of the processing region (Pan, Para. [0086], the attribution correction unit 36 compares the scores at each of the overlapping joint points when the overlapping joint points are included in the joint points determined to belong to the same person in the image. The attribution correction unit 36 determines that any of the overlapping joint points does not belong to the person based on the comparison result).
The proposed combination as well as the motivation for combining the Divakaran, Okami, Herley and Pan references presented in the rejection of Claim 1, apply to Claim 21 and are incorporated herein by reference. Thus, the device recited in Claim 21 is met by Divakaran in view of Okami further in view of Herley and Pan.
Claim 3 is rejected under 35 U.S.C. 103 as being unpatentable over Divakaran et al. (US 2014/0347475 A1) in view of Okami et al. (US 2021/0174526 A1) further in view of Cormac Herley (US 2004/0264806 A1) and Yadong Pan (US 2024/0303855 A1, with priority to PCT/JP2021/001248, filed January 15, 2021, PGPub used herein as a translation and for mapping purposes), as applied to claims 1-2, 9, 11, 13-15 and 18-21 above, and further in view of Zucker et al. (US 2021/0166417 A1).
Regarding claim 3, Divakaran in view of Okami further in view of Herley and Pan teaches the image processing device according to claim 1, as described above.
Although Divakaran in view of Okami further in view of Herley and Pan teaches forming a mosaic image to reduce or eliminate areas of occlusion (Herley, Para. [0057]), they do not explicitly teach “wherein the process performed on the processing region includes blacking out of pixels”. However, in an analogous field of endeavor, Zucker teaches an occluded item recognizer preprocesses pixels of the occluded image to remove and black out all pixels present in the occluded image which are unlikely to be associated with the unknown item identified by the item tracker. This allows the trained machine-learning or neural network to focus on just those remaining unique pixels of the occluded image that are most likely only associated with the unknown item (Zucker, Para. [0036]).
Therefore, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the device of Divakaran in view of Okami further in view of Herley and Pan with the teachings of Zucker by including blacking out pixels in the occluded image (i.e., processing region) to allow for identifying the unknown item (i.e., estimating joint point). One having ordinary skill in the art would have been motivated to combine these references because doing so would allow for more accurate prediction in an occluded image, as recognized by Zucker. Thus, the claimed invention would have been obvious to one having ordinary skill in the art before the effective filing date.
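An illustrative sketch of the blacking-out step as characterized from Zucker Para. [0036]; the mask convention is an assumption:

```python
# Zero every pixel outside the mask of likely-relevant pixels so downstream
# recognition focuses only on the remaining pixels.
import numpy as np

def black_out(image: np.ndarray, keep_mask: np.ndarray) -> np.ndarray:
    """image: (H, W, 3); keep_mask: (H, W) bool. Pixels outside the mask go black."""
    out = image.copy()
    out[~keep_mask] = 0
    return out
```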
Claim 5 is rejected under 35 U.S.C. 103 as being unpatentable over Divakaran et al. (US 2014/0347475 A1) in view of Okami et al. (US 2021/0174526 A1) further in view of Cormac Herley (US 2004/0264806 A1) and Yadong Pan (US 2024/0303855 A1, with priority to PCT/JP2021/001248, filed January 15, 2021, PGPub used herein as a translation and for mapping purposes), as applied to claims 1-2, 9, 11, 13-15 and 18-21 above, and further in view of Endo et al. (US 2016/0044222 A1).
Regarding claim 5, Divakaran in view of Okami further in view of Herley and Pan teaches the image processing device according to claim 1, and further teaches
wherein a process of processing the processing region in the image is performed based on a result of the determination (Divakaran, Para. [0040], the illustrative scene awareness module computes static and dynamic occlusion regions in the frames of the video stream and generates the static and dynamic occlusion maps. Static occluders detected in the scene are marked with static masks using a distinguishing color and dynamic occluders are marked with dynamic masks in a different distinguishing color (i.e., color conversion)).
Although Divakaran in view of Okami further in view of Herley and Pan teaches marking masks of occlusion regions with a distinguishing color (Divakaran, Para. [0040]), they do not explicitly teach “wherein the instructions, when executed by the at least one processor, further cause the image processing device to determine color information to be set to the processing region based on color information of at least one of clothes of the reference person, a part of the another person, and a periphery of the processing region”. However, in an analogous field of endeavor, Endo teaches setting a detection condition based on the characteristic color information relating to the clothes color of the person who is identified as the subject (Endo, Para. [0073]).
Therefore, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the device of Divakaran in view of Okami further in view of Herley and Pan with the teachings of Endo by including the color information to be set to the processing region based on the color of the clothes of the reference person. One having ordinary skill in the art would have been motivated to combine these references, because doing so would allow for detecting a state of a subject with reduced detection errors, as recognized by Endo. Thus, the claimed invention would have been obvious to one having ordinary skill in the art before the effective filing date.
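A minimal sketch, under assumed mask conventions, of setting the processing region's color from the reference person's clothes, in the spirit of the proposed Endo modification:

```python
# Paint the processing region with the mean color sampled from the clothes
# region of the reference person.
import numpy as np

def fill_from_clothes(image: np.ndarray, region_mask: np.ndarray,
                      clothes_mask: np.ndarray) -> np.ndarray:
    """Masks are (H, W) bool; returns a copy with the processing region
    filled by the average clothes color."""
    out = image.copy()
    out[region_mask] = image[clothes_mask].mean(axis=0).astype(image.dtype)
    return out
```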
Claims 6-8, 10 and 12 are rejected under 35 U.S.C. 103 as being unpatentable over Divakaran et al. (US 2014/0347475 A1) in view of Okami et al. (US 2021/0174526 A1) further in view of Cormac Herley (US 2004/0264806 A1) and Yadong Pan (US 2024/0303855 A1, with priority to PCT/JP2021/001248, filed January 15, 2021, PGPub used herein as a translation and for mapping purposes), as applied to claims 1-2, 9, 11, 13-15 and 18-21 above, and further in view of Miyazaki et al. (US 2018/0047181 A1).
Regarding claim 6, Divakaran in view of Okami further in view of Herley and Pan teaches the image processing device according to claim 1, as described above.
Although Divakaran in view of Okami further in view of Herley and Pan teaches using static and dynamic occlusions in images as processing regions (Divakaran, Para. [0040]), they do not explicitly teach “wherein, in a case where a size of a portion, in the region of the another person, overlapping a region of the reference person exceeds a threshold value, a part of the region of the another person is determined as the processing region”. However, in an analogous field of endeavor, Miyazaki teaches determining whether or not an overlapping region is equal to or greater than the size of the person region (i.e., threshold value) and, when it is, proceeding to estimate the position coordinates and the search range of the person (i.e., determine the processing region as a part of the region of the other person) (Miyazaki, Paras. [0133]-[0134]).
Therefore, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the device of Divakaran in view of Okami further in view of Herley and Pan with the teachings of Miyazaki by including determining whether the overlapping region exceeds a threshold value and setting the processing region as a part of the region of the other person when the overlapping region exceeds the threshold value. One having ordinary skill in the art would have been motivated to combine these references because doing so would allow for reducing erroneous determination of the position of a target person, as recognized by Miyazaki. Thus, the claimed invention would have been obvious to one having ordinary skill in the art before the effective filing date.
Regarding claim 7, Divakaran in view of Okami further in view of Herley and Pan teaches the image processing device according to claim 1, as described above.
Although Divakaran in view of Okami further in view of Herley and Pan teaches using static and dynamic occlusions in images as processing regions (Divakaran, Para. [0040]), they do not explicitly teach “wherein, in a case where a size of a portion, in the region of the another person, overlapping a region of the reference person does not exceed a threshold value, all of the region of the another person is determined as the processing region”. However, in an analogous field of endeavor, Miyazaki teaches that when the overlapping region of the search ranges is not equal to or greater than the size of the person region (i.e., threshold value), the change unit fixes the estimated search range as the search range of the person (i.e., determines all of the region as the processing region) (Miyazaki, Para. [0135]).
The proposed combination as well as the motivation for combining the Divakaran, Okami, Herley, Pan and Miyazaki references presented in the rejection of Claim 6, apply to Claim 7 and are incorporated herein by reference. Thus, the system recited in Claim 7 is met by Divakaran in view of Okami further in view of Herley, Pan and Miyazaki.
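Taken together, the complementary branches mapped to claims 6 and 7 can be sketched as follows; the box representation and the "part of the region" heuristic are assumptions for illustration:

```python
# If the overlap between the two person regions exceeds a threshold, only part
# of the other person's region (here, the overlapping portion) becomes the
# processing region; otherwise all of it does. Boxes are (x0, y0, x1, y1).
def choose_processing_region(ref_box, other_box, threshold_area):
    x0 = max(ref_box[0], other_box[0]); y0 = max(ref_box[1], other_box[1])
    x1 = min(ref_box[2], other_box[2]); y1 = min(ref_box[3], other_box[3])
    overlap = max(0, x1 - x0) * max(0, y1 - y0)
    if overlap > threshold_area:
        return (x0, y0, x1, y1)   # claim 6 branch: a part of the region
    return other_box              # claim 7 branch: all of the region
```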
Regarding claim 8, Divakaran in view of Okami further in view of Herley and Pan teaches the image processing device according to claim 2, as described above.
Although Divakaran in view of Okami further in view of Herley and Pan teaches using static and dynamic occlusions in images as processing regions (Divakaran, Para. [0040]), they do not explicitly teach “wherein a size of the processing region is determined based on at least one of (1) a size of a portion, in the region of the another person, overlapping the region of the reference person, (2) a position of the region of the another person with respect to the region of the reference person, (3) a distance between a center coordinate of the region of the reference person and a center coordinate of the region of the another person, and (4) number of other persons concealing the reference person”. However, in an analogous field of endeavor, Miyazaki teaches setting a search range (i.e., processing region) based on a size of an overlapping region (Miyazaki, Paras. [0133]-[0135]).
The proposed combination as well as the motivation for combining the Divakaran, Okami, Herley, Pan and Miyazaki references presented in the rejection of Claim 6, apply to Claim 8 and are incorporated herein by reference. Thus, the system recited in Claim 8 is met by Divakaran in view of Okami further in view of Herley, Pan and Miyazaki.
Regarding claim 10, Divakaran in view of Okami further in view of Herley and Pan teaches the image processing device according to claim 9, as described above.
Although Divakaran in view of Okami further in view of Herley and Pan teaches using static and dynamic occlusions in images as processing regions (Divakaran, Para. [0040]), they do not explicitly teach “wherein a size of the processing region is determined based on at least one of (1) a size of a portion, in the region of the another person, overlapping the region of the reference person, (2) a position of the region of the another person with respect to the region of the reference person, (3) a distance between a center coordinate of the region of the reference person and a center coordinate of the region of the another person, and (4) number of other persons concealing the reference person”. However, in an analogous field of endeavor, Miyazaki teaches setting a search range (i.e., processing region) based on a size of an overlapping region (Miyazaki, Paras. [0133]-[0135]).
The proposed combination as well as the motivation for combining the Divakaran, Okami, Herley, Pan and Miyazaki references presented in the rejection of Claim 6, apply to Claim 10 and are incorporated herein by reference. Thus, the system recited in Claim 10 is met by Divakaran in view of Okami further in view of Herley, Pan and Miyazaki.
Regarding claim 12, Divakaran in view of Okami further in view of Herley, Pan and Miyazaki teaches the image processing device according to claim 8, and further teaches wherein, in a case where detection reliability of each of a region of joint points of the reference person and a region of joint points of the another person in the image is greater than a threshold value, all of the region of the joint points of the another person is determined as the processing region (Pan, Para. [0067], calculating a score indicating the possibility that each joint point belongs to the person in the image and determining the person in the image to which the joint point belongs by using the calculated score. Para. [0085], the score for the person 41 is smaller than the score for the person 42, and therefore the attribution determination unit determines the person to which the joint point belongs as the person 41. Para. [0088], the posture estimation unit then estimates the posture of the person based on the detection result by the joint point detection unit).
The proposed combination as well as the motivation for combining the Divakaran, Okami, Herley, Pan and Miyazaki references presented in the rejection of Claim 6, apply to Claim 12 and are incorporated herein by reference. Thus, the system recited in Claim 12 is met by Divakaran in view of Okami further in view of Herley, Pan and Miyazaki.
Claim 16 is rejected under 35 U.S.C. 103 as being unpatentable over Divakaran et al. (US 2014/0347475 A1) in view of Okami et al. (US 2021/0174526 A1) further in view of Cormac Herley (US 2004/0264806 A1) and Yadong Pan (US 2024/0303855 A1, with priority to PCT/JP2021/001248, filed January 15, 2021, PGPub used herein as a translation and for mapping purposes), as applied to claims 1-2, 9, 11, 13-15 and 18-21 above, and further in view of Xiaodong Huang (US 2016/0307350 A1).
Regarding claim 16, Divakaran in view of Okami further in view of Herley and Pan teaches the image processing device according to claim 1, as described above.
Although Divakaran in view of Okami further in view of Herley and Pan teaches forming a mosaic image to reduce or eliminate areas of occlusion (Herley, Para. [0057]), they do not explicitly teach “wherein the process performed on the processing region includes deformation”. However, in an analogous field of endeavor, Huang teaches determining an initial homography of a 3x3 matrix that will approximately transform all of the pixels of the right image to the perspective of the overlapped region of the left image (Huang, Para. [0121]).
Therefore, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the device of Divakaran in view of Okami further in view of Herley and Pan with the teachings of Huang by including performing deformation (i.e., homography transformation) on the processing region. One having ordinary skill in the art would have been motivated to combine these references because doing so would allow for optimized panoramic stitching, as recognized by Huang. Thus, the claimed invention would have been obvious to one having ordinary skill in the art before the effective filing date.
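An illustrative sketch of the homography-based deformation characterized from Huang, using OpenCV's warpPerspective; the matrix values below are a made-up example, not taken from the reference:

```python
# Warp an image by a 3x3 homography matrix.
import numpy as np
import cv2

H = np.array([[1.0, 0.05, 10.0],
              [0.0, 1.00,  5.0],
              [0.0, 0.00,  1.0]])  # example 3x3 homography (illustrative values)

def deform(image: np.ndarray) -> np.ndarray:
    h, w = image.shape[:2]
    return cv2.warpPerspective(image, H, (w, h))
```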
Claim 17 is rejected under 35 U.S.C. 103 as being unpatentable over Divakaran et al. (US 2014/0347475 A1) in view of Okami et al. (US 2021/0174526 A1) further in view of Cormac Herley (US 2004/0264806 A1) and Yadong Pan (US 2024/0303855 A1, with priority to PCT/JP2021/001248, filed January 15, 2021, PGPub used herein as a translation and for mapping purposes), as applied to claims 1-2, 9, 11, 13-15 and 18-21 above, and further in view of Ma et al. (US 2015/0120244 A1).
Regarding claim 17, Divakaran in view of Okami further in view of Herley and Pan teaches the image processing device according to claim 1, as described above.
Although Divakaran in view of Okami further in view of Herley and Pan teaches forming a mosaic image to reduce or eliminate areas of occlusion (Herley, Para. [0057]), they do not explicitly teach “wherein the process performed on the processing region includes softening”. However, in an analogous field of endeavor, Ma teaches estimating road widths even in cases where there are occlusions, such as by using a smoothing algorithm (Ma, Para. [0080]). Ma further teaches the smoothing algorithm is based on an arbitrary number N of neighbors, in which case the road width is adjusted to be either the mean value or median value of road widths of the N sections of the point cloud centered on the section being evaluated (Ma, Para. [0055]).
Therefore, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the device of Divakaran in view of Okami further in view of Herley and Pan with the teachings of Ma by including performing a smoothing algorithm (i.e., softening) on the processing region. One having ordinary skill in the art before the effective filing date of the claimed invention would have been motivated to combine these references, because doing so would allow for still performing estimates even in cases where there are occlusions, as recognized by Ma. Thus, the claimed invention would have been obvious to one having ordinary skill in the art before the effective filing date.
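A minimal sketch of the N-neighbor smoothing Ma describes (Para. [0055]), transplanted to a generic 1-D signal; the edge handling is an assumption:

```python
# Replace each value with the median of a window of N samples centered on it.
import numpy as np

def median_smooth(values: np.ndarray, n: int = 5) -> np.ndarray:
    half = n // 2
    out = np.empty(len(values), dtype=float)
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        out[i] = np.median(values[lo:hi])
    return out
```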
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Emma Rose Goebel whose telephone number is (703)756-5582. The examiner can normally be reached Monday - Friday 7:30-5.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Amandeep Saini can be reached at (571) 272-3382. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/Emma Rose Goebel/
Examiner, Art Unit 2662

/AMANDEEP SAINI/
Supervisory Patent Examiner, Art Unit 2662