Prosecution Insights
Last updated: April 18, 2026
Application No. 18/019,879

CAUSAL INTERACTION DETECTION APPARATUS, CONTROL METHOD, AND COMPUTER-READABLE STORAGE MEDIUM

Status: Final Rejection (§102)
Filed: Feb 06, 2023
Examiner: SHERRILLO, DYLAN JOSEPH
Art Unit: 2665
Tech Center: 2600 — Communications
Assignee: NEC Corporation
OA Round: 2 (Final)
Grant Probability: 91% (Favorable)
Expected OA Rounds: 3-4
Time to Grant: 2y 11m
With Interview: 99%

Examiner Intelligence

Career Allow Rate: 91%, above average (39 granted / 43 resolved; +28.7% vs TC avg)
Interview Lift: +11.8% on resolved cases with interview (moderate lift)
Avg Prosecution: 2y 11m typical timeline; 14 applications currently pending
Total Applications: 57 across all art units (career history)
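The headline rate follows directly from the raw counts above; as a quick sanity check (note the Tech Center average is back-solved from the stated delta, not reported directly by the page):

```python
# Sanity-check the examiner stats above: the 91% career allow rate is
# 39 granted out of 43 resolved, and the Tech Center average implied by
# the stated +28.7% delta is back-solved here (it is not given directly).
granted, resolved = 39, 43

allow_rate = 100 * granted / resolved
print(f"Career allow rate: {allow_rate:.1f}%")   # 90.7%, displayed rounded to 91%

delta_vs_tc_avg = 28.7
implied_tc_avg = round(allow_rate) - delta_vs_tc_avg
print(f"Implied TC average: {implied_tc_avg:.1f}%")
```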

Statute-Specific Performance

§101: 6.2% (-33.8% vs TC avg)
§103: 46.9% (+6.9% vs TC avg)
§102: 42.3% (+2.3% vs TC avg)
§112: 2.5% (-37.5% vs TC avg)
Tech Center averages are estimates. Based on career data from 43 resolved cases.
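The four per-statute deltas above are mutually consistent with a single Tech Center average estimate of 40.0% per statute; this short check reproduces the displayed figures from that one back-solved assumption:

```python
# Reproduce the "(+/-x.x% vs TC avg)" figures above from a single
# TC-average estimate of 40.0% per statute (back-solved from the stated
# deltas; the page does not report the average directly).
examiner_rates = {"101": 6.2, "103": 46.9, "102": 42.3, "112": 2.5}
tc_average = 40.0

for statute, rate in examiner_rates.items():
    print(f"§{statute}: {rate:.1f}% ({rate - tc_average:+.1f}% vs TC avg)")
```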

Office Action

§102
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Information Disclosure Statement

The information disclosure statement (IDS) submitted on 02/06/2023 is in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statement is being considered by the examiner.

Status of Claim(s)

Claim(s) 1, 4, 6-9, 12, 14-17 and 20 is/are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Andriluka (US 9058663 B2).

Claim(s) 2-3, 5, 10-11, 13, and 18-19 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

Claim Rejections - 35 USC § 102

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:

A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.
Claim(s) 1, 4, 6-9, 12, 14-17 and 20 is/are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Andriluka (US 9058663 B2).

Regarding Claim 1: Andriluka (US 9058663 B2) teaches:

A causal interaction detection apparatus comprising (Col 2. Lines 31-37, “As part of recovering 3D pose estimates, the techniques described herein explicitly account for interactions between people in the recorded video. More specifically, individual subjects are treated as mutual "context" for one another. One embodiment provides an automatic framework for estimating 3D pose of interacting people performing complex activities from monocular observations.”):

at least one processor (Col 1. Lines 51-53, “This method may further include generating, by operation of one or more computer processors, …”); and

memory storing instructions (Col 4. Lines 50-57, “More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM),”);

wherein the at least one processor is configured to execute the instructions to (Col 4. Lines 50-57, “More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM),”):

extract pose information for each of persons detected from a video data, the pose information indicating poses of the person in a time series (Col 7. Lines 25-28, “More specifically, as a first step towards inferring 2D or 3D poses of people, positions of potentially interacting people are recovered from the video sequence and tracked over time (i.e., over multiple times).”);

generate, for each of the persons, a change model that shows change in pose over time based on the pose information (Col 8. Lines 57-67, “Further, the pose recovery tool may use tree-structured pictorial structures (PS). However, embodiments (1) add flexibility to parts allowing the model to effectively capture foreshortening and different views of the person with respect to the camera and (2) condition the pose estimates on person detections to deal with image over-counting and encourage consistent pose estimates over time, and (3) utilize a collection of multi-person pictorial structure models that encode modes of interactions between two people (aspects). This new model is referred to herein as a multi-aspect flexible pictorial structure (MaFPS) model.”);

determine, for each of one or more sets of a plurality of the persons, whether times of changes in pose of the persons in the set correlate with each other (Col 3. Lines 22-40, “Note, the discussion below uses a pair of individuals dancing as an example of a joint, correlated activity which constrains 2D and 3D pose of the two individuals engaging in the activity. Of course, one of ordinary skill in the art will recognize that the techniques for recovering 2D and 3D pose are in no way limited to dancing, and that embodiments of the invention may be readily adapted to recover 2D and 3D pose estimates for two (or more) individuals engaging a variety of different correlated activities (whether complex, periodic, or otherwise). For example, the techniques described herein may be used to estimate the pose of athletes interacting with one another, e.g., boxers in the ring. Doing so may facilitate a virtual replay and analysis of their motion. Another example includes estimates of 3D pose for people involved in semi-periodic interactions (e.g., a play between defender and striker in an attempt to score a goal during a soccer match). Another example is estimates of 3D pose of people involved in a correlated but potentially spatially disjoint activity (e.g., a Simon says game).”); and

detect the persons whose times of changes in pose are determined to correlate with each other, as the persons having a causal relationship with each other (Col 4. Lines 5-12, “What makes this unique (gaming) is the pose recovery based on grouping people engaged in a correlated activity with one another, using proximity detections of interacting persons (either spatially or temporally/statistically). Those correlated activities imply a context that assists the pose recovery tool for each individual. The correlated activity is also used to infer a constrained post, which is used to recover and estimate a 3D pose.”).

Regarding Claim 4: Andriluka teaches: The causal interaction detection apparatus according to claim 1, wherein the at least one processor is further configured to: determine whether a distance between a first person and a second person is equal to or less than a threshold (Col 8. Lines 21-31, Two people are dancing within the threshold of a distance range and are grouped together by activity); and determine the first person does not have a causal relationship with the second person when the distance is determined to be larger than the threshold (Col 8. Lines 21-25, Associations between people are only made if the distance is less than a predetermined threshold relating to distance).
Regarding Claim 6: Andriluka teaches: The causal interaction detection apparatus according to claim 1, wherein the pose information represents the pose of the person at a frame of the video data by co-ordinates of multiple keypoints of the person detected from the frame (Figure 4A and 4B, Points of the human body are identified and their coordinate locations are shown in 4B).

Regarding Claim 7: Andriluka teaches: The causal interaction detection apparatus according to claim 1, wherein the change model of the person represents the change in pose of the person at a frame of the video data by a dissimilarity value between the pose of the person at the frame and a reference pose (Col 13. Lines 15-25, Calculating changes in pose compared to a reference model wherein the difference is represented as delta).

Regarding Claim 8: Andriluka teaches: The causal interaction detection apparatus according to claim 7, wherein the dissimilarity value for the person at a frame of the video data is computed as a distance between the pose of the person at the frame and the reference pose (Col 13. Lines 15-25, Difference between pose of two different subjects in reference frame compared to another frame).

Regarding Claim 9: Andriluka teaches:

A control method performed by a computer, comprising (Col 1. Lines 51-57, “This method may further include generating, by operation of one or more computer processors, a 2D pose estimation for at least the first person. The 2D pose estimation is generated, at least in part, to account for constraints on positions of body parts of the first and second person resulting from participating in the correlated activity.”):

extracting pose information for each of persons detected from a video data, the pose information indicating poses of the person in a time series (Col 7. Lines 25-28, “More specifically, as a first step towards inferring 2D or 3D poses of people, positions of potentially interacting people are recovered from the video sequence and tracked over time (i.e., over multiple times).”);

generating, for each of the persons, a change model that shows change in pose over time based on the pose information (Col 8. Lines 57-67, “Further, the pose recovery tool may use tree-structured pictorial structures (PS). However, embodiments (1) add flexibility to parts allowing the model to effectively capture foreshortening and different views of the person with respect to the camera and (2) condition the pose estimates on person detections to deal with image over-counting and encourage consistent pose estimates over time, and (3) utilize a collection of multi-person pictorial structure models that encode modes of interactions between two people (aspects). This new model is referred to herein as a multi-aspect flexible pictorial structure (MaFPS) model.”);

determining, for each of one or more sets of a plurality of the persons, whether times of changes in pose of the persons in the set correlate with each other (Col 3. Lines 22-40, “Note, the discussion below uses a pair of individuals dancing as an example of a joint, correlated activity which constrains 2D and 3D pose of the two individuals engaging in the activity. Of course, one of ordinary skill in the art will recognize that the techniques for recovering 2D and 3D pose are in no way limited to dancing, and that embodiments of the invention may be readily adapted to recover 2D and 3D pose estimates for two (or more) individuals engaging a variety of different correlated activities (whether complex, periodic, or otherwise). For example, the techniques described herein may be used to estimate the pose of athletes interacting with one another, e.g., boxers in the ring. Doing so may facilitate a virtual replay and analysis of their motion. Another example includes estimates of 3D pose for people involved in semi-periodic interactions (e.g., a play between defender and striker in an attempt to score a goal during a soccer match). Another example is estimates of 3D pose of people involved in a correlated but potentially spatially disjoint activity (e.g., a Simon says game).”); and

detecting the persons whose times of changes in pose are determined to correlate with each other, as the persons having a causal relationship with each other (Col 4. Lines 5-12, “What makes this unique (gaming) is the pose recovery based on grouping people engaged in a correlated activity with one another, using proximity detections of interacting persons (either spatially or temporally/statistically). Those correlated activities imply a context that assists the pose recovery tool for each individual. The correlated activity is also used to infer a constrained post, which is used to recover and estimate a 3D pose.”).

Regarding Claim 12: Andriluka teaches: The control method according to claim 9, further comprising: determining whether a distance between a first person and a second person is equal to or less than a threshold (Col 8. Lines 21-31, Two people are dancing within the threshold of a distance range and are grouped together by activity); and determining the first person does not have a causal relationship with the second person when the distance is determined to be larger than the threshold (Col 8. Lines 21-25, Associations between people are only made if the distance is less than a predetermined threshold relating to distance).

Regarding Claim 14: Andriluka teaches: The control method according to claim 9, wherein the pose information represents the pose of the person at a frame of the video data by co-ordinates of multiple keypoints of the person detected from the frame (Figure 4A and 4B, Points of the human body are identified and their coordinate locations are shown in 4B).
Regarding Claim 15: Andriluka teaches: The control method according to claim 9, wherein the change model of the person represents the change in pose of the person at a frame of the video data by a dissimilarity value between the pose of the person at the frame and a reference pose (Col 13. Lines 15-25, Calculating changes in pose compared to a reference model wherein the difference is represented as delta).

Regarding Claim 16: Andriluka teaches: The control method according to claim 15, wherein the dissimilarity value for the person at a frame of the video data is computed as a distance between the pose of the person at the frame and the reference pose (Col 13. Lines 15-25, Difference between pose of two different subjects in reference frame compared to another frame).

Regarding Claim 17: Andriluka teaches:

A non-transitory computer-readable storage medium storing a program that cause a computer to execute:

extracting pose information for each of persons detected from a video data, the pose information indicating poses of the person in a time series (Col 7. Lines 25-28, “More specifically, as a first step towards inferring 2D or 3D poses of people, positions of potentially interacting people are recovered from the video sequence and tracked over time (i.e., over multiple times).”);

generating, for each of the persons, a change model that shows change in pose over time based on the pose information (Col 8. Lines 57-67, “Further, the pose recovery tool may use tree-structured pictorial structures (PS). However, embodiments (1) add flexibility to parts allowing the model to effectively capture foreshortening and different views of the person with respect to the camera and (2) condition the pose estimates on person detections to deal with image over-counting and encourage consistent pose estimates over time, and (3) utilize a collection of multi-person pictorial structure models that encode modes of interactions between two people (aspects). This new model is referred to herein as a multi-aspect flexible pictorial structure (MaFPS) model.”);

determining, for each of one or more sets of a plurality of the persons, whether times of changes in pose of the persons in the set correlate with each other (Col 3. Lines 22-40, “Note, the discussion below uses a pair of individuals dancing as an example of a joint, correlated activity which constrains 2D and 3D pose of the two individuals engaging in the activity. Of course, one of ordinary skill in the art will recognize that the techniques for recovering 2D and 3D pose are in no way limited to dancing, and that embodiments of the invention may be readily adapted to recover 2D and 3D pose estimates for two (or more) individuals engaging a variety of different correlated activities (whether complex, periodic, or otherwise). For example, the techniques described herein may be used to estimate the pose of athletes interacting with one another, e.g., boxers in the ring. Doing so may facilitate a virtual replay and analysis of their motion. Another example includes estimates of 3D pose for people involved in semi-periodic interactions (e.g., a play between defender and striker in an attempt to score a goal during a soccer match). Another example is estimates of 3D pose of people involved in a correlated but potentially spatially disjoint activity (e.g., a Simon says game).”); and

detecting the persons whose times of changes in pose are determined to correlate with each other, as the persons having a causal relationship with each other (Col 4. Lines 5-12, “What makes this unique (gaming) is the pose recovery based on grouping people engaged in a correlated activity with one another, using proximity detections of interacting persons (either spatially or temporally/statistically). Those correlated activities imply a context that assists the pose recovery tool for each individual. The correlated activity is also used to infer a constrained post, which is used to recover and estimate a 3D pose.”).

Regarding Claim 20: Andriluka teaches: The non-transitory computer-readable storage medium according to claim 17 (Col 4. Lines 50-58, “More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.”), wherein the program further causes the computer to execute: determining whether a distance between a first person and a second person is equal to or less than a threshold (Col 8. Lines 21-31, Two people are dancing within the threshold of a distance range and are grouped together by activity); and determining the first person does not have a causal relationship with the second person when the distance is determined to be larger than the threshold (Col 8. Lines 21-25, Associations between people are only made if the distance is less than a predetermined threshold relating to distance).

Allowable Subject Matter

Claim(s) 2-3, 5, 10-11, 13, and 18-19 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

Relevant Art Directed to State of Art

ISHIHARA (US 20210304434 A1)
Hanamoto (US 20190213791 A1)
KUSANO (JP 6887586 B1)

Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to DYLAN J SHERRILLO whose telephone number is (703)756-5605.
The examiner can normally be reached 1st Week of Bi-week Monday - Thursday 10am - 7:30pm EST, 2nd Week of Bi-week Monday - Thursday 10am - 7:30pm EST, Friday 10am - 6:30pm EST.

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Stephen R Koziol, can be reached at (408) 918-7630. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/D.J.S./
Examiner, Art Unit 2665

/Stephen R Koziol/
Supervisory Patent Examiner, Art Unit 2665
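Claims 1, 4, and 6-8 recite a concrete pipeline: extract per-frame keypoint poses, reduce each person's pose track to a dissimilarity-from-reference signal, find the times where that signal changes, and flag pairs whose change times correlate, gated by a spatial distance threshold. A minimal sketch of that pipeline, reconstructed from the claim language only (all function names, data layouts, and threshold values are hypothetical; this is not NEC's implementation or Andriluka's method):

```python
import math

def dissimilarity(pose, reference):
    """Claim 8: dissimilarity as a distance between two poses, each a
    list of (x, y) keypoint coordinates (claim 6)."""
    return math.sqrt(sum((px - rx) ** 2 + (py - ry) ** 2
                         for (px, py), (rx, ry) in zip(pose, reference)))

def change_times(poses, reference, change_threshold):
    """Claims 1 and 7: a 'change model' reduced to the frame indices at
    which the dissimilarity from the reference pose jumps."""
    d = [dissimilarity(p, reference) for p in poses]
    return {t for t in range(1, len(d)) if abs(d[t] - d[t - 1]) > change_threshold}

def correlated(times_a, times_b, tolerance=1):
    """Claim 1: do change times correlate? Here: every change of A is
    matched by a change of B within `tolerance` frames (a stand-in for
    whatever correlation test an implementation would actually use)."""
    return bool(times_a) and all(
        any(abs(ta - tb) <= tolerance for tb in times_b) for ta in times_a)

def causal_pairs(tracks, reference, change_threshold, distance_threshold):
    """Claims 1 and 4: report pairs whose pose changes correlate in time,
    skipping pairs farther apart than the distance threshold."""
    pairs, ids = [], sorted(tracks)
    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            # claim 4: spatial gate on (hypothetical) person centroids
            (ax, ay), (bx, by) = tracks[a]["position"], tracks[b]["position"]
            if math.hypot(ax - bx, ay - by) > distance_threshold:
                continue
            ta = change_times(tracks[a]["poses"], reference, change_threshold)
            tb = change_times(tracks[b]["poses"], reference, change_threshold)
            if correlated(ta, tb) and correlated(tb, ta):
                pairs.append((a, b))
    return pairs
```

For example, two adjacent persons whose poses jump at the same frame would be reported as a pair, while a third person far beyond the distance threshold would be excluded by the claim-4 gate even if their change times matched.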

Prosecution Timeline

Feb 06, 2023: Application Filed
Sep 26, 2025: Non-Final Rejection — §102
Jan 02, 2026: Response Filed
Apr 07, 2026: Final Rejection — §102 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12591907: SYSTEM AND METHOD TO DETECT A GAZE AT AN OBJECT BY UTILIZING AN IMAGE SENSOR (granted Mar 31, 2026; 2y 5m to grant)
Patent 12579798: IMAGE PROCESSING METHOD AND APPARATUS (granted Mar 17, 2026; 2y 5m to grant)
Patent 12567166: DEVICE FOR PROCESSING IMAGE AND OPERATING METHOD THEREOF (granted Mar 03, 2026; 2y 5m to grant)
Patent 12541825: MODEL TRAINING METHOD, IMAGE PROCESSING METHOD, COMPUTING AND PROCESSING DEVICE AND NON-TRANSIENT COMPUTER-READABLE MEDIUM (granted Feb 03, 2026; 2y 5m to grant)
Patent 12530826: CORRECTION OF ARTIFACTS OF TOMOGRAPHIC RECONSTRUCTIONS BY NEURON NETWORKS (granted Jan 20, 2026; 2y 5m to grant)
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 3-4
Grant Probability: 91%
With Interview: 99% (+11.8%)
Median Time to Grant: 2y 11m
PTA Risk: Moderate
Based on 43 resolved cases by this examiner. Grant probability derived from career allow rate.
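The "With Interview" figure is not an independent statistic. One reading consistent with the displayed numbers is the base grant probability plus the interview lift, capped at a display ceiling; both the additive formula and the 99% cap are inferences, not anything the page documents:

```python
# Hypothetical reconstruction of the "With Interview" projection: base
# grant probability plus the +11.8 point interview lift, capped at 99%.
# The additive formula and the ceiling are assumptions inferred from the
# displayed figures, not a documented methodology.
base_probability = 91.0  # career allow rate, rounded
interview_lift = 11.8    # percentage-point lift on interviewed cases
ceiling = 99.0

with_interview = min(base_probability + interview_lift, ceiling)
print(f"With interview: {with_interview:.0f}%")  # matches the displayed 99%
```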
