Prosecution Insights
Last updated: April 19, 2026
Application No. 18/061,358

MAIN OBJECT DETERMINATION APPARATUS, IMAGE CAPTURING APPARATUS, AND METHOD FOR CONTROLLING MAIN OBJECT DETERMINATION APPARATUS

Non-Final OA §103
Filed: Dec 02, 2022
Examiner: CAMMARATA, MICHAEL ROBERT
Art Unit: 2667
Tech Center: 2600 — Communications
Assignee: Canon Kabushiki Kaisha
OA Round: 3 (Non-Final)
Grant Probability: 70% (Favorable)
Expected OA Rounds: 3-4
Time to Grant: 2y 4m
Grant Probability with Interview: 99%

Examiner Intelligence

Career Allow Rate: 70% — above average (213 granted / 305 resolved; +7.8% vs TC avg)
Interview Lift: +35.9% (strong; resolved cases with vs. without interview)
Typical Timeline: 2y 4m avg prosecution; 46 currently pending
Career History: 351 total applications across all art units

Statute-Specific Performance

§101: 4.5% (-35.5% vs TC avg)
§103: 45.8% (+5.8% vs TC avg)
§102: 21.1% (-18.9% vs TC avg)
§112: 24.6% (-15.4% vs TC avg)
Tech Center averages are estimates • Based on career data from 305 resolved cases
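The statute-specific figures above are internally consistent: working backwards from each "vs TC avg" delta, the implied Tech Center baseline is 40% for every statute, so each delta is a simple difference in percentage points, and the 70% allow rate follows directly from the raw counts. A minimal sketch (our own illustration, not the vendor's actual model; variable names are assumptions):

```python
# Illustrative check of the dashboard figures, using only numbers shown
# on this page. TC_BASELINE is the implied Tech Center average (assumed).

examiner_rate = {      # examiner's statute-specific rejection rates
    "§101": 0.045,
    "§103": 0.458,
    "§102": 0.211,
    "§112": 0.246,
}
TC_BASELINE = 0.40     # implied by every "vs TC avg" delta above

for statute, rate in examiner_rate.items():
    delta = rate - TC_BASELINE
    print(f"{statute}: {rate:.1%} ({delta:+.1%} vs TC avg)")

# Career allow rate from the raw counts (213 granted / 305 resolved):
granted, resolved = 213, 305
print(f"Career allow rate: {granted / resolved:.1%}")  # 69.8%, shown as 70%
```

The same subtraction reproduces all four deltas (e.g. 4.5% − 40% = −35.5% for §101), which is why the "vs TC avg" figures can be read as percentage-point differences rather than relative changes.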

Office Action

§103
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Arguments

Applicant's arguments filed 18 December 2025 have been fully considered but they are not persuasive. Applicant argues that Shimauchi's posture recognition is used for scene recognition and layout control such that there is no suggestion of using posture-derived feature quantities for same-object determination. Applicant further argues that the concept of using posture-based candidate selection together with posture-based continuity determination as an integrated approach is not taught or suggested by Kosiakova and Shimauchi.

In response to applicant's arguments against the references individually (the integrated approach argument summarized above), one cannot show nonobviousness by attacking references individually where the rejections are based on combinations of references. See In re Keller, 642 F.2d 413, 208 USPQ 871 (CCPA 1981); In re Merck & Co., 800 F.2d 1091, 231 USPQ 375 (Fed. Cir. 1986). In further response, Kosiakova clearly teaches main object selection based on, e.g., color features {[0062]-[0068] including automatic (main) object detection from video clips using main object features which include colors, materials, and/or objects associated with a main object}, while Shimauchi teaches that posture-based and color-based tracking are interchangeable equivalents for distinguishing between main and secondary objects as per [0095]-[0097]. Moreover, Shimauchi establishes that object posture and/or color features may be recognized and used to distinguish between main object and secondary object.
It would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains to have modified Kosiakova's main object selection system, which already discloses an object detecting unit configured to detect a plurality of objects from the images and a selection unit configured to select at least one candidate of a main object from among the plurality of objects based on feature recognition using color features, such that Kosiakova's main object selection method includes an obtaining unit configured to obtain posture information about each of the plurality of objects by obtaining positions of a plurality of joints of each of the plurality of objects, and wherein Kosiakova's selection unit selects at least one candidate of a main object from among the plurality of objects on the basis of the posture information about the plurality of objects for each image acquired by the acquisition unit as taught by Shimauchi: because joint position enables more accurate determination of the main subject and permits enhanced layout decisions based on a determination that the main subject is, for example, walking, as motivated by Shimauchi in [0097]-[0104], [0108]-[0115], Figs. 7A, 7B, 13, 16 and their corresponding disclosure sections including [0177]-[0188]; because Shimauchi establishes that object posture is a feature that may be recognized and used to distinguish between main object and secondary objects such that the substitution of posture-based main object determination for color-based main object determination is further motivated by Shimauchi; because there is a reasonable expectation of success; and/or because doing so merely combines prior art elements according to known methods to yield predictable results.
Furthermore, posture-based continuity determination does not materially differ, at least in terms of the claim language, from posture-based object tracking, which determines the continuity of an object over multiple frames as disclosed by Shimauchi.

Claim Rejections - 35 USC § 103

The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office action.

Claims 1, 2, 5, 7, 10, 12, and 13 are rejected under 35 U.S.C. 103 as being unpatentable over Kosiakova (US 20210097291 A1) and Shimauchi (US 20230124466 A1).

Claim 1

In regards to claim 1, Kosiakova discloses a main object determination apparatus comprising: one or more processors; and a memory storing instructions which, when executed by the one or more processors {Figs. 1A, 11 including server 108, [0043] processors 1114, instructions 1124 stored in memory 1112 executed by processor, [0096]-[0104]}, cause the main object determination apparatus to function as: an acquisition unit configured to acquire images captured at different timings {Fig. 1 including video camera, [0001], [0022], [0033] acquiring successive frames of video images (images captured at different timings), Fig. 3 step 302 receive video [0051]}; an object detecting unit configured to detect a plurality of objects from the images {see Fig. 2 main object component 214, [0042]-[0046], [0052]-[0055], Fig. 4, determine main object in frame 406, [0062]-[0068] including real-time object detection (e.g. YOLO or SSD) of plural objects}; a selection unit configured to select at least one candidate of a main object from among the plurality of objects {see Fig. 2 main object component 214, [0042]-[0046], Fig. 4, determine main object in frame 406, [0062]-[0068] including automatic (main) object detection from video clips using main object features which include colors, materials, and/or objects associated with a main object}; and a determination unit configured to determine whether at least one candidate of the main object selected by the selection unit is the same in the images captured at different timings, wherein the main object is determined in a case where the determination unit determines that at least one candidate of the main object selected by the selection unit in an image of interest and in at least one image captured within a predetermined period of time before the image of interest is captured are the same {see [0044], [0051]-[0053], [0056]-[0057??], [0063], [0091]-[0093] in which the appearance time of objects frame-to-frame in a scene can be used by the main object component 214 to identify and distinguish main objects. In other words, per [0042]-[0046], main object component 214 identifies a main object in a frame of the video using automatic identification including feature analysis (information about a feature point) while also determining whether this main object candidate persists or is otherwise the same from frame to frame in the video. Further, as to "image captured within a predetermined period of time before the image of interest is captured," note that this phrase merely describes a conventional video captured at a framerate (e.g. 30 fps or 60 fps are common) such that each frame is captured within a predetermined period of time (e.g. 1/30 sec or 1/60 sec) before the image of interest is captured. See also [0051], [0056], [0060].
Kosiakova analyzes each of these time-sequential frames to determine whether the candidate object is the same from one frame to the next, such that the main object component 214 determines that at least one candidate of the main object selected by the selection unit in an image of interest and in at least one image captured within a predetermined period of time before the image of interest is captured are the same. See in particular [0044], [0052], [0091] in which the appearance of an object frame-to-frame in a scene can be used to distinguish the main object from background objects (other candidate objects that are not the main object) as per the determination}.

Shimauchi is analogous art from the same field of main object (main subject) determination including video cameras 100 and a control unit with posture estimation 331, tracking 332, and action recognition 333; see Figs. 1, 2, 6, 11 and cites below. Shimauchi teaches an acquisition unit configured to acquire images captured at different timings {see Fig. 2, image capturing apparatus 100, [0075]-[0076] capturing video (images at different timings)}; an object detecting unit configured to detect a plurality of objects from the images; an obtaining unit configured to obtain posture information about each of the plurality of objects by obtaining positions of a plurality of joints of each of the plurality of objects {Fig. 2, image processing apparatus 300 detects objects (subjects 10, 30) and posture estimation unit 331 estimates the postures of the objects. In more detail, see Figs. 2-4, [0089] in which the captured image includes the objects (main subject 10 and secondary subjects 30), both of which are main object candidates, and may select the main object based on a facial feature match as per [0096].
Moreover, posture estimation 331 estimates positions of the joints of the object as per [0090]-[0096] and tracking unit 332 tracks the main subject based on the posture information (including joint position)}; a selection unit configured to select at least one candidate of a main object from among the plurality of objects on the basis of the posture information about the plurality of objects for each image acquired by the acquisition unit {see the tracking unit 332 tracking the objects 10, 30 across multiple frames of the captured video; such tracking of the objects across frames (images captured at different timings) is also used to individually identify/select the main object from among the other objects, [0090]-[0092], [0095]. Shimauchi also teaches that posture-based and color-based tracking are interchangeable equivalents for distinguishing between main and secondary objects as per [0095]-[0097]. Moreover, Shimauchi establishes that object posture and/or color features may be recognized and used to distinguish between main object and secondary object. In addition, once the main object is selected, Shimauchi adds an attribute/metadata item as per [0096] that selects the person determined to be the lecturer (main object). Moreover, once the main object (lecturer) is selected, that main object's action is recognized by action recognition unit 333 to determine downstream actions (e.g. cropping and/or display screen layout based on the main object's actions as per [0097]-[0104], [0108]-[0115], Figs. 7A, 7B, 13, 16 and their corresponding disclosure sections) including [0177]-[0188]}; wherein the determination unit determines whether the at least one candidate of the main object is the same or not between the image of interest and the at least one image captured within the predetermined period of time before the image of interest is captured, based on a skeletal structure represented by a positional relationship among the joints {see Fig. 4, [0090]-[0092], [0095] including tracking unit 332 that determines/tracks the main object 10 on the basis of color (e.g. of clothes) or by using only the posture estimation information as the main object moves from frame to frame (between the image of interest and the at least one image captured within the predetermined period of time before the image of interest is captured)}.

It would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains to have modified Kosiakova's main object selection system, which already discloses an object detecting unit configured to detect a plurality of objects from the images and a selection unit configured to select at least one candidate of a main object from among the plurality of objects, such that Kosiakova's main object selection method includes an obtaining unit configured to obtain posture information about each of the plurality of objects by obtaining positions of a plurality of joints of each of the plurality of objects, wherein Kosiakova's selection unit selects at least one candidate of a main object from among the plurality of objects on the basis of the posture information about the plurality of objects for each image acquired by the acquisition unit as also taught by Shimauchi, and wherein the determination unit determines whether the at least one candidate of the main object is the same or not between the image of interest and the at least one image captured within the predetermined period of time before the image of interest is captured, based on a skeletal structure represented by a positional relationship among the joints as further taught by Shimauchi: because joint position enables more accurate determination of the main subject and permits enhanced layout decisions based on a determination that the main subject is, for example, walking, as motivated by Shimauchi in [0097]-[0104], [0108]-[0115], Figs. 7A, 7B, 13, 16 and their corresponding disclosure sections including [0177]-[0188]; because Shimauchi teaches the equivalence of using color or posture to track/determine the main object in [0095] such that Kosiakova's color-based main object determination in [0053]-[0055] may be equivalently replaced by Shimauchi's posture-based main object determination; because there is a reasonable expectation of success; and/or because doing so merely combines prior art elements according to known methods to yield predictable results.

Claim 2

In regards to claim 2, Kosiakova discloses wherein the object refers to a person or an animal {main object component 214 identifies the main person (human) in the video frame as per [0044]-[0046], [0052]-[0054]}.

Claim 5

In regards to claim 5, Kosiakova discloses wherein the selection unit calculates reliability corresponding to a degree of possibility of being a main object for each of the objects {the automatic identification used by main object component 214 includes associated confidence levels (reliability) indicating probability of correct identification in [0044]-[0045], [0052]-[0053], [0066]-[0067], [0079], [0091]-[0092]}.

Claim 7

In regards to claim 7, Kosiakova discloses wherein the selection unit selects an object having a maximum value of the reliability as at least one candidate of the main object {the automatic identification used by main object component 214 includes associated confidence levels (reliability) indicating probability of correct identification, and the best match (maximum reliability) is determined to be the main object as per [0044]-[0045], [0052]-[0053], [0066]-[0067], [0079], [0091]-[0092]}.
Claim 10

In regards to claim 10, Kosiakova discloses wherein the selection unit does not select the candidate of the main object from an image that is not captured within the predetermined period of time before the image of interest is captured {see [0056] in which the video clips that are analyzed for main object determination are limited to one or three frames from each of the first and second video clips. As such, frames not captured within a predetermined period of time before the image of interest is captured (e.g. the fourth prior frame) are not used for main object determination, and thus the selection unit does not select the candidate from such old frames}.

Claims 12 and 13

The rejection of apparatus claim 1 above applies mutatis mutandis to the corresponding limitations of method claim 12 and computer readable medium claim 13, while noting that the rejection above cites to device, method, and computer readable medium disclosures.

Claims 4, 6, 8, 9, and 11 are rejected under 35 U.S.C. 103 as being unpatentable over Kosiakova and Shimauchi as applied to claim 1 above, and further in view of Sugawara (JP 2018066889 A). A marked-up machine translation of Sugawara has been previously provided; all cross-references are with respect to this translation, and the mark-ups are hereby incorporated by reference to further demonstrate claim mapping.

Claim 4

In regards to claim 4, Kosiakova discloses wherein the object detecting unit detects a center of gravity of the corresponding object of the plurality of objects, position information indicating a part of a body of a corresponding object of the plurality of objects, or a position or motion vector of a corresponding object of the plurality of objects {see [0054]-[0055] in which the feature amount is a position indicating an associated part of the body of the object (e.g. red plaid shirt, baseball hat, glasses, brown jacket, white shoes). See also block 308, determine continuity of main object. See also [0062] in which the main object region of interest is propagated across additional frames for identifying the main object, which likely includes position information of the object, but position information is not explicitly disclosed}.

Sugawara is a highly related and analogous reference from the same field of main object determination and solves a similar problem of how to reliably determine the main object. See abstract, field, pg. 3, in which recognition unit 111 of the camera's control device 110 tracks a main object (specific target) from a video sequence of plural time-sequential images. Sugawara also teaches wherein the object detecting unit detects a center of gravity of the corresponding object of the plurality of objects, position information indicating a part of a body of a corresponding object of the plurality of objects, or a position or motion vector of a corresponding object of the plurality of objects {pgs. 9-10, Modification of the first embodiment, which uses not only distance to the main object (soccer ball) but also the moving direction (motion vector) of the candidate to determine the main object (e.g. the closest detected face moving in the same direction as the ball 151 is the main object). See also Modification 2, which also considers position (depth) relative to the important object to determine the main object, in pgs. 10-11}.
It would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains to have modified Kosiakova, which already discloses a related concept of determining feature weight using a variety of factors including distance to the main object, such that the information about the feature amount is a center of gravity of the corresponding object of the plurality of objects, position information indicating a part of a body of a corresponding object of the plurality of objects, or a position or motion vector of a corresponding object of the plurality of objects as taught by Sugawara, because doing so advantageously focusses the camera on the main object (soccer player) based on which player is closest to the ball and moving in the same direction, thus permitting viewers to most reliably watch the most relevant player in control of the ball (closest thereto and moving in the same direction) as the main object of the camera's focus, as motivated by Sugawara.

Claim 6

In regards to claim 6, Kosiakova discloses wherein each of the plurality of objects refers to a person. Sugawara also teaches wherein each of the plurality of objects refers to a person, and wherein the selection unit uses a distance between each person and a ball to calculate the reliability {main subject detection unit 112 detects which player is closest to the ball and continues video capture by directing focus adjustment unit 114 to focus on the main object. See pgs. 3-6, 8-9; see also Figs. 2-6}.
It would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains to have modified Kosiakova, which already discloses a related concept of determining feature weight using a variety of factors including distance to the main object, such that wherein each of the plurality of objects refers to a person, and wherein the selection unit uses a distance between each person and a ball to calculate the reliability as taught by Sugawara, because doing so advantageously focusses the camera on the main object (soccer player) based on which player is closest to the ball, thus permitting viewers to most reliably watch the player in control of the ball (closest thereto) as the main object of the camera's focus, as motivated by Sugawara.

Claim 8

In regards to claim 8, Kosiakova is not relied upon to disclose, but Sugawara teaches, wherein the selection unit also selects an object having a value of the reliability different from the maximum value of the reliability by less than a predetermined value as at least one candidate of the main object {pgs. 9-10, Modification of the first embodiment, which uses not only distance to the main object (soccer ball) but also the moving direction (motion vector) of the candidate to determine the main object (e.g. the closest detected face that is moving in the same direction as the ball 151 is the main object). See also Modification 2, which also considers position (depth) relative to the important object to determine the main object, in pgs. 10-11. In both cases, the method selects an object having a value of reliability different from the maximum (not the closest player) by less than a predetermined value (not closest but within the detection range) when the player object is moving in the same direction as the important object or is closest in the depth direction}.

It would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains to have modified Kosiakova, which already discloses a related concept of determining feature weight using a variety of factors including distance to the main object, such that wherein the selection unit also selects an object having a value of the reliability different from the maximum value of the reliability by less than a predetermined value as the candidate of the main object as taught by Sugawara, because doing so advantageously focusses the camera on the main object (soccer player) based on which player is closest to the ball and moving in the same direction, thus permitting viewers to most reliably watch the most relevant player in control of the ball (closest thereto and moving in the same direction, or closest in the depth direction) as the main object of the camera's focus, as motivated by Sugawara.

Claim 9

In regards to claim 9, Kosiakova is not relied upon to disclose, but Sugawara teaches, a tracking unit configured to track the objects, wherein, in a case where the determination unit determines that at least one candidate of the main object are the same, the tracking unit changes a tracking target in the image of interest to the main object {see pgs. 1-6 in which the control device detects and tracks a tracking target, which may initially be the ball but which is changed to the main subject when main subject detection unit 112 detects which player is closest to the ball within a detection range. Also, focus adjustment unit 114 controls the focus state to focus on the main subject detected by detection unit 112 and tracked by the control unit}.

It would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains to have modified Kosiakova to include a tracking unit configured to track the objects, wherein, in a case where the determination unit determines that the candidates of the main object are the same, the tracking unit changes a tracking target in the image of interest to the main object as taught by Sugawara, because doing so advantageously tracks and focusses the camera on the main object (soccer player) based on which player is closest to the ball, thus permitting viewers to most reliably watch the player in control of the ball (closest thereto) as the main object of the camera's focus, as motivated by Sugawara.

Claim 11

In regards to claim 11, Kosiakova discloses the main object determination apparatus according to claim 1 {see above mapping of claim 1}. Sugawara teaches an image capturing unit configured to capture an object image formed via an imaging optical system {Fig. 1 camera 100, pg. 2}. It would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains to have modified Kosiakova to include an image capturing unit configured to capture an object image formed via an imaging optical system as taught by Sugawara, because doing so enables adaptive tracking of the main object, as motivated by Sugawara.

Conclusion

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. US 20220230438 A1 discloses main object detection and employs posture and action classification (e.g. basketball shooting) of the main object to trigger camera operations such as slow-motion video start as per [0162]-[0171].
Wagg (US 2009/0059007 A1) discloses a main object determination apparatus comprising: one or more processors; and a memory storing instructions which, when executed by the one or more processors {Fig. 1 content processing workstation 10 including a Cell processor per [0033]}, cause the main object determination apparatus to function as: an acquisition unit configured to acquire images captured at different timings {Fig. 1 including video camera(s) 20, 22.1, 22.2, [0031], [0036] acquiring successive frames of video images (images captured at different timings)}; a selection unit configured to select a candidate of a main object from among objects using information about a feature point of each object in the images {see [0042] applying a mean human shape to extract/select a candidate of a main object}.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to Michael R Cammarata whose telephone number is (571) 272-0113. The examiner can normally be reached M-Th 7am-5pm EST.

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Matthew Bella, can be reached at 571-272-7778. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/MICHAEL ROBERT CAMMARATA/
Primary Examiner, Art Unit 2667
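The combination the Office Action relies on — selecting a main-object candidate from joint positions and confirming frame-to-frame continuity from the skeletal structure (the positional relationship among the joints) — can be sketched in a few lines. This is our own illustration of the general technique, not code from Kosiakova or Shimauchi; the normalized pairwise-distance "signature," the function names, and the threshold are all assumptions:

```python
import math

# Hypothetical sketch of posture-based continuity determination.
# A "skeleton" is a list of (x, y) joint positions; two candidates in
# successive frames are treated as the same object when the positional
# relationship among their joints is sufficiently similar.

def skeletal_signature(joints):
    """Pairwise joint distances normalized by the largest distance,
    making the signature invariant to translation and uniform scale.
    Assumes at least two joints."""
    dists = [
        math.dist(a, b)
        for i, a in enumerate(joints)
        for b in joints[i + 1:]
    ]
    largest = max(dists) or 1.0  # avoid division by zero for degenerate input
    return [d / largest for d in dists]

def same_object(joints_prev, joints_curr, threshold=0.05):
    """Compare skeletal signatures between the image of interest and an
    image captured within the predetermined period before it. The 0.05
    threshold is an arbitrary illustrative choice."""
    sig_a = skeletal_signature(joints_prev)
    sig_b = skeletal_signature(joints_curr)
    err = sum(abs(a - b) for a, b in zip(sig_a, sig_b)) / len(sig_a)
    return err < threshold

# Example: the same person translated 10 px to the right between frames.
frame_a = [(0, 0), (0, 10), (5, 5)]            # joints in the earlier frame
frame_b = [(x + 10, y) for (x, y) in frame_a]  # joints in the image of interest
print(same_object(frame_a, frame_b))           # True: treated as the same object
```

This also mirrors the Office Action's framerate observation: in a conventional 30 fps stream, `joints_prev` comes from a frame captured a fixed 1/30 sec before the image of interest, i.e. within a "predetermined period of time."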

Prosecution Timeline

Dec 02, 2022
Application Filed
Feb 26, 2025
Non-Final Rejection — §103
Jul 15, 2025
Response Filed
Sep 08, 2025
Final Rejection — §103
Dec 18, 2025
Request for Continued Examination
Jan 16, 2026
Response after Non-Final Action
Jan 28, 2026
Non-Final Rejection — §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12602797
RECONSTRUCTION OF BODY MOTION USING A CAMERA SYSTEM
2y 5m to grant Granted Apr 14, 2026
Patent 12586171
METHODS AND SYSTEMS FOR GRADING DEVICES
2y 5m to grant Granted Mar 24, 2026
Patent 12579597
Point Group Data Synthesis Apparatus, Non-Transitory Computer-Readable Medium Having Recorded Thereon Point Group Data Synthesis Program, Point Group Data Synthesis Method, and Point Group Data Synthesis System
2y 5m to grant Granted Mar 17, 2026
Patent 12579835
INFORMATION PROCESSING DEVICE, INFORMATION PROCESSING METHOD, AND COMPUTER-READABLE RECORDING MEDIUM FOR DISTINGUISHING OBJECT AND SHADOW THEREOF IN IMAGE
2y 5m to grant Granted Mar 17, 2026
Patent 12567283
FACIAL RECOGNITION DATABASE USING FACE CLUSTERING
2y 5m to grant Granted Mar 03, 2026
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 3-4
Grant Probability: 70%
With Interview: 99% (+35.9%)
Median Time to Grant: 2y 4m
PTA Risk: High
Based on 305 resolved cases by this examiner. Grant probability derived from career allow rate.
