DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Priority
Receipt is acknowledged that this application is a National Stage application of PCT/JP21/37415. Priority to PCT/JP21/37415 with a priority date of 8 October 2021 is acknowledged under 35 U.S.C. 365(c) and 37 CFR 1.78.
Information Disclosure Statement
The IDSs dated 20 March 2024 and 27 November 2024 have been considered and placed in the application file.
Claim Interpretation
Under MPEP 2143.03, "All words in a claim must be considered in judging the patentability of that claim against the prior art." In re Wilson, 424 F.2d 1382, 1385, 165 USPQ 494, 496 (CCPA 1970). As a general matter, the grammar and the ordinary meaning of the terms used in a claim, as understood by one having ordinary skill in the art, will dictate whether, and to what extent, the language limits the claim scope. Language that suggests or makes a feature or step optional, but does not require that feature or step, does not limit the scope of a claim under the broadest reasonable claim interpretation.
Under SuperGuide Corp. v. DirecTV Enters., Inc., 358 F.3d 870 (Fed. Cir. 2004), “the phrase ‘at least one of’ precedes a series of categories of criteria, and the patentee used the term ‘and’ to separate the categories of criteria, which connotes a conjunctive list. The district court correctly interpreted this phrase as requiring that the user select at least one value for each category; that is, at least one of a desired program start time, a desired program end time, a desired program service, and a desired program type.” SuperGuide, 358 F.3d at 886.
Claims 1-4, 6-9 and 11-15 recite “and.” Since “and” is conjunctive, all of the elements must be found in the prior art to reject the claim.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. § 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
Claims 4, 9 and 14 are rejected under 35 U.S.C. § 112(b) as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor regards as the invention. Claims 4, 9 and 14 recite “the position of the head of the person in the three dimension at a second timing between the first timing and the second timing.” It is unclear how the second timing can be between the first and the second timing. For the purpose of prior art analysis, Examiner assumes applicant meant that the second timing was between the first and the third timing.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
[Image: Uchiyama et al. Fig. 9, showing the use of epipolar information from multiple cameras to determine the position of a head.]
Claims 1-15 are rejected under 35 U.S.C. 103 as being unpatentable over US Patent Publication 2017/0154424 A1 (Uchiyama et al.).
Claim 1
Regarding Claim 1, Uchiyama et al. teach a non-transitory computer-readable recording medium storing a tracking program ("The object tracking unit 103 tracks the object, in the camera image, detected by the object detection unit 102," paragraph [0036]) causing a computer to execute a process of:
specifying a head region of a person from each of a plurality of images captured by a plurality of cameras ("a position of the head of a person, in the camera image, detected by the object detection unit 102," paragraph [0036]);
specifying a set of head regions corresponding to a same person based on each of positions of the head regions specified from the plurality of images ("estimating the three dimensional position of a person in accordance with the position of the head of the person in the camera image obtained from each camera," paragraph [0050] where from each camera means there are a plurality of cameras and images); and
specifying a position of a head of the person in a three dimension based on a position of the set of the head regions corresponding to the same person in a two dimension and parameters of the plurality of cameras ("estimating the three dimensional position of a person in accordance with the position of the head of the person in the camera image obtained from each camera," paragraph [0050] and "A fundamental matrix F including information on the positional relationship between the first and the second camera images can be calculated based on the positions, the orientations, and the intrinsic parameters of the cameras in the camera information stored in the camera information storage unit," paragraph [0059]).
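For clarity of the record, the cited three-dimensional position estimation may be illustrated by the following sketch of two-view triangulation. All function names, camera matrices, and coordinate values below are hypothetical and illustrative only; they are not drawn from, and are not asserted to represent, the implementation of Uchiyama et al.

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Estimate a 3D point from its projections in two camera images.

    P1, P2 : 3x4 camera projection matrices (intrinsic and extrinsic
             parameters combined).
    x1, x2 : (u, v) positions of the same head region in each image.
    Uses the direct linear transform (DLT): each view contributes two
    linear constraints on the homogeneous 3D point, and the solution is
    the null vector of the stacked constraint matrix.
    """
    A = np.array([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]  # dehomogenize

def project(P, X):
    """Project a 3D point into an image with camera matrix P."""
    h = P @ np.append(X, 1.0)
    return h[:2] / h[2]

# Hypothetical cameras: a reference view and a unit baseline along x.
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])
head = np.array([1.0, 2.0, 5.0])        # ground-truth head position
x1, x2 = project(P1, head), project(P2, head)
print(triangulate(P1, P2, x1, x2))      # recovers approximately [1, 2, 5]
```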
It is recognized that the citations and evidence provided above are derived from potentially different embodiments of a single reference. Nevertheless, it would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains to employ combinations and sub-combinations of these complementary embodiments, because Uchiyama et al. explicitly motivate doing so at least in paragraphs [0027], [00] and [0110], including “The scope of the following claims is to be accorded the broadest interpretation so as to encompass all such modifications and equivalent structures and functions.”, and otherwise motivate experimentation and optimization.
The rejection of medium claim 1 above applies mutatis mutandis to the corresponding limitations of method claim 6 and apparatus claim 11, while noting that the rejection above cites both device and method disclosures. Claims 6 and 11 are mapped below for clarity of the record and to specify any new limitations not included in claim 1.
Claim 2
Regarding claim 2, Uchiyama et al. teach the non-transitory computer-readable recording medium according to claim 1, wherein the specifying the set of the head regions includes specifying whether, based on a distance between an epipolar line ("When a distance between the representative point of a person in the second camera image and the epipolar line 801 is equal to or smaller than a predetermined value, the person in the second camera image matches the person in the first camera image," paragraph [0060]), which is included in a first image and corresponds to a second head region included in a second image, and a first head region included in the first image, the first head region and the second head region correspond to the head region corresponding to the same person ("estimating the three dimensional position of a person in accordance with the position of the head of the person in the camera image obtained from each camera," paragraph [0050] where from each camera means there are a plurality of cameras and images).
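For clarity of the record, the cited epipolar-distance matching may be illustrated by the following sketch. The fundamental matrix, coordinates, and threshold below are hypothetical and illustrative only, and are not asserted to represent the implementation of Uchiyama et al.

```python
import numpy as np

def epipolar_distance(F, x1, x2):
    """Distance from a point in the second image to the epipolar line
    induced by a point in the first image.

    F  : 3x3 fundamental matrix with the convention l2 = F @ x1.
    x1 : head position (u, v) in the first image.
    x2 : candidate head position (u, v) in the second image.
    """
    l = F @ np.array([x1[0], x1[1], 1.0])  # line a*u + b*v + c = 0
    return abs(l @ np.array([x2[0], x2[1], 1.0])) / np.hypot(l[0], l[1])

def same_person(F, x1, x2, threshold=5.0):
    """Match two detections when the candidate lies within `threshold`
    pixels of the epipolar line."""
    return epipolar_distance(F, x1, x2) <= threshold

# Hypothetical rectified stereo pair: epipolar lines are horizontal, so
# the distance reduces to the vertical offset between the detections.
F = np.array([[0.0, 0.0, 0.0], [0.0, 0.0, -1.0], [0.0, 1.0, 0.0]])
print(same_person(F, (320.0, 240.0), (100.0, 243.0)))  # True  (3 px off)
print(same_person(F, (320.0, 240.0), (100.0, 300.0)))  # False (60 px off)
```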
Claim 3
Regarding claim 3, Uchiyama et al. teach the non-transitory computer-readable recording medium according to claim 2, wherein the distance is corrected based on a size of the first head region and a size of the second head region ("In the present exemplary embodiment, the object attribute acquisition unit 104 acquires the information on the height of the object. Alternatively, the object attribute acquisition unit 104 may acquire information on the size of the object. [0039] The object attribute acquisition unit 104 includes a first position estimation unit 110," paragraphs [0038]-[0039], where a position estimation unit corrects distance, in this case using information on the size).
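For clarity of the record, one illustrative form of such a size-based correction is sketched below. The particular normalization chosen here is hypothetical and is not asserted to be that of Uchiyama et al.

```python
def size_corrected_distance(distance, size1, size2):
    """Normalize an epipolar distance by the mean apparent head size.

    Dividing by the average head-region size (in pixels) makes a fixed
    matching threshold scale-invariant: a large (near) head tolerates a
    larger pixel distance than a small (far) head.
    """
    return distance / ((size1 + size2) / 2.0)

# A 6-pixel offset between 20- and 40-pixel head regions normalizes to 0.2.
print(size_corrected_distance(6.0, 20.0, 40.0))  # 0.2
```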
Claim 4
Regarding claim 4, Uchiyama et al. teach the non-transitory computer-readable recording medium according to claim 1, wherein the process further includes: estimating, based on the position of the head of the person in the three dimension which is specified based on the plurality of images captured by each of the plurality of cameras at a first timing and a third timing, the position of the head of the person in the three dimension at a second timing between the first timing and the second timing ("Thus, the matching person is tracked over a plurality of time points. As a result of the processing, a tracking label i is obtained for each person. The tracking label i is a code for identifying each tracked person," paragraph [0049], where tracking a person over time includes teaching to interpolate positions between the timings).
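For clarity of the record, the interpolation discussed above may be illustrated by the following sketch of linear interpolation between the first and third timings. The timings and positions below are hypothetical and illustrative only.

```python
import numpy as np

def interpolate_head_position(p_first, p_third, t_first, t_second, t_third):
    """Linearly interpolate a 3D head position at an intermediate timing.

    p_first, p_third : 3D positions specified at the first and third
                       timings; t_second must lie between t_first and
                       t_third.
    """
    alpha = (t_second - t_first) / (t_third - t_first)
    return (1.0 - alpha) * np.asarray(p_first) + alpha * np.asarray(p_third)

# Hypothetical values: the head moves 2 m in x and 1 m in y over one second.
p = interpolate_head_position([0.0, 0.0, 1.7], [2.0, 1.0, 1.7], 0.0, 0.5, 1.0)
print(p)  # midpoint [1.0, 0.5, 1.7]
```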
Claim 5
Regarding claim 5, Uchiyama et al. teach the non-transitory computer-readable recording medium according to claim 1, wherein the specifying the position of the head of the person in the three dimension further includes:
specifying trajectory information of the position of the head of the person in the three dimension for each window section including continuous image frames ("Symbols (frames) 1202, each representing the person area tracked in step S403, are overlapped on the camera image 1201. The frames of the same person in different camera images are colored with the same color, so that the user can recognize the same person in different camera images," paragraph [0079]); and
associating the trajectory information for each window section ("A person in the camera image 1201 and the same person in the 3D map 1204 are colored with the same color, so that the user can easily identify the same person in the camera image 1201 and the 3D map 1204," paragraph [0080]).
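For clarity of the record, the claimed window sections and trajectory association may be illustrated by the following sketch. The window length, gap threshold, and track values are hypothetical and illustrative only, and are not asserted to represent the implementation of Uchiyama et al.

```python
import numpy as np

def split_windows(positions, window):
    """Split a per-frame sequence of 3D head positions into fixed-length
    window sections of consecutive frames."""
    return [positions[i:i + window] for i in range(0, len(positions), window)]

def windows_match(track_a, track_b, max_gap=0.5):
    """Associate two window trajectories when the last position of one
    section lies within `max_gap` of the first position of the next."""
    return np.linalg.norm(np.asarray(track_a[-1]) - np.asarray(track_b[0])) <= max_gap

# Hypothetical track: ten frames split into window sections of four frames.
track = [[0.1 * i, 0.0, 1.7] for i in range(10)]
windows = split_windows(track, 4)
print(len(windows))                           # 3 sections
print(windows_match(windows[0], windows[1]))  # True: sections join smoothly
```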
Claim 6
Regarding claim 6, Uchiyama et al. teach a tracking method ("The object tracking unit 103 tracks the object, in the camera image, detected by the object detection unit 102," paragraph [0036]) comprising:
specifying a head region of a person from each of a plurality of images captured by a plurality of cameras ("a position of the head of a person, in the camera image, detected by the object detection unit 102," paragraph [0036]);
specifying a set of head regions corresponding to a same person based on each of positions of the head regions specified from the plurality of images ("estimating the three dimensional position of a person in accordance with the position of the head of the person in the camera image obtained from each camera," paragraph [0050] where from each camera means there are a plurality of cameras and images); and
specifying a position of a head of the person in a three dimension based on a position of the set of the head regions corresponding to the same person in a two dimension and parameters of the plurality of cameras ("estimating the three dimensional position of a person in accordance with the position of the head of the person in the camera image obtained from each camera," paragraph [0050] and "A fundamental matrix F including information on the positional relationship between the first and the second camera images can be calculated based on the positions, the orientations, and the intrinsic parameters of the cameras in the camera information stored in the camera information storage unit," paragraph [0059]).
Claim 7
Regarding claim 7, Uchiyama et al. teach the tracking method according to claim 6, wherein the specifying the set of the head regions includes specifying whether, based on a distance between an epipolar line ("When a distance between the representative point of a person in the second camera image and the epipolar line 801 is equal to or smaller than a predetermined value, the person in the second camera image matches the person in the first camera image," paragraph [0060]), which is included in a first image and corresponds to a second head region included in a second image, and a first head region included in the first image, the first head region and the second head region correspond to the head region corresponding to the same person ("estimating the three dimensional position of a person in accordance with the position of the head of the person in the camera image obtained from each camera," paragraph [0050] where from each camera means there are a plurality of cameras and images).
Claim 8
Regarding claim 8, Uchiyama et al. teach the tracking method according to claim 7, wherein the distance is corrected based on a size of the first head region and a size of the second head region ("In the present exemplary embodiment, the object attribute acquisition unit 104 acquires the information on the height of the object. Alternatively, the object attribute acquisition unit 104 may acquire information on the size of the object. [0039] The object attribute acquisition unit 104 includes a first position estimation unit 110," paragraphs [0038]-[0039], where a position estimation unit corrects distance, in this case using information on the size).
Claim 9
Regarding claim 9, Uchiyama et al. teach the tracking method according to claim 6, further comprising:
estimating, based on the position of the head of the person in the three dimension which is specified based on the plurality of images captured by each of the plurality of cameras at a first timing and a third timing, the position of the head of the person in the three dimension at a second timing between the first timing and the second timing ("Thus, the matching person is tracked over a plurality of time points. As a result of the processing, a tracking label i is obtained for each person. The tracking label i is a code for identifying each tracked person," paragraph [0049], where tracking a person over time includes teaching to interpolate positions between the timings).
Claim 10
Regarding claim 10, Uchiyama et al. teach the tracking method according to claim 6, wherein the specifying the position of the head of the person in the three dimension further includes:
specifying trajectory information of the position of the head of the person in the three dimension for each window section including continuous image frames ("Symbols (frames) 1202, each representing the person area tracked in step S403, are overlapped on the camera image 1201. The frames of the same person in different camera images are colored with the same color, so that the user can recognize the same person in different camera images," paragraph [0079]); and
associating the trajectory information for each window section ("A person in the camera image 1201 and the same person in the 3D map 1204 are colored with the same color, so that the user can easily identify the same person in the camera image 1201 and the 3D map 1204," paragraph [0080]).
Claim 11
Regarding claim 11, Uchiyama et al. teach an information processing apparatus ("The object tracking unit 103 tracks the object, in the camera image, detected by the object detection unit 102," paragraph [0036]) comprising:
a memory ("The storage medium may include, for example, one or more of a hard disk, a random-access memory (RAM), a read only memory (ROM), a storage of distributed computing systems, an optical disk (such as a compact disc (CD), digital versatile disc (DVD), or Blu-ray Disc (BD)™), a flash memory device, a memory card, and the like," paragraph [0109]); and
a processor coupled to the memory ("one or more processors in the computers in the system or the device read out and execute the program," paragraph [0108]) and configured to:
specify a head region of a person from each of a plurality of images captured by a plurality of cameras ("a position of the head of a person, in the camera image, detected by the object detection unit 102," paragraph [0036]);
specify a set of head regions corresponding to a same person based on each of positions of the head regions specified from the plurality of images ("estimating the three dimensional position of a person in accordance with the position of the head of the person in the camera image obtained from each camera," paragraph [0050] where from each camera means there are a plurality of cameras and images); and
specify a position of a head of the person in a three dimension based on a position of the set of the head regions corresponding to the same person in a two dimension and parameters of the plurality of cameras ("estimating the three dimensional position of a person in accordance with the position of the head of the person in the camera image obtained from each camera," paragraph [0050] and "A fundamental matrix F including information on the positional relationship between the first and the second camera images can be calculated based on the positions, the orientations, and the intrinsic parameters of the cameras in the camera information stored in the camera information storage unit," paragraph [0059]).
Claim 12
Regarding claim 12, Uchiyama et al. teach the information processing apparatus according to claim 11, wherein the processor specifies whether, based on a distance between an epipolar line ("When a distance between the representative point of a person in the second camera image and the epipolar line 801 is equal to or smaller than a predetermined value, the person in the second camera image matches the person in the first camera image," paragraph [0060]), which is included in a first image and corresponds to a second head region included in a second image, and a first head region included in the first image, the first head region and the second head region correspond to the head region corresponding to the same person ("estimating the three dimensional position of a person in accordance with the position of the head of the person in the camera image obtained from each camera," paragraph [0050] where from each camera means there are a plurality of cameras and images).
Claim 13
Regarding claim 13, Uchiyama et al. teach the information processing apparatus according to claim 12, wherein the processor corrects the distance based on a size of the first head region and a size of the second head region ("In the present exemplary embodiment, the object attribute acquisition unit 104 acquires the information on the height of the object. Alternatively, the object attribute acquisition unit 104 may acquire information on the size of the object. [0039] The object attribute acquisition unit 104 includes a first position estimation unit 110," paragraphs [0038]-[0039], where a position estimation unit corrects distance, in this case using information on the size).
Claim 14
Regarding claim 14, Uchiyama et al. teach the information processing apparatus according to claim 11, wherein the processor estimates, based on the position of the head of the person in the three dimension which is specified based on the plurality of images captured by each of the plurality of cameras at a first timing and a third timing, the position of the head of the person in the three dimension at a second timing between the first timing and the second timing ("Thus, the matching person is tracked over a plurality of time points. As a result of the processing, a tracking label i is obtained for each person. The tracking label i is a code for identifying each tracked person," paragraph [0049], where tracking a person over time includes teaching to interpolate positions between the timings).
Claim 15
Regarding claim 15, Uchiyama et al. teach the information processing apparatus according to claim 11, wherein the processor specifies trajectory information of the position of the head of the person in the three dimension for each window section including continuous image frames ("Symbols (frames) 1202, each representing the person area tracked in step S403, are overlapped on the camera image 1201. The frames of the same person in different camera images are colored with the same color, so that the user can recognize the same person in different camera images," paragraph [0079]) and associate the trajectory information for each window section ("A person in the camera image 1201 and the same person in the 3D map 1204 are colored with the same color, so that the user can easily identify the same person in the camera image 1201 and the 3D map 1204," paragraph [0080]).
References Cited
The prior art made of record and not relied upon is considered pertinent to applicant’s disclosure.
US Patent Publication 2023/0326063 A1 to Oami discloses a detection unit (2020), a state estimation unit (2040), and a height estimation unit (2080). The detection unit (2020) detects a target person from a video frame. The state estimation unit (2040) estimates a state of the detected target person. The height estimation unit (2080) estimates a height of the person on the basis of a height of the target person in the video frame in a case where the estimated state satisfies a predetermined condition.
Non-patent publication “Geometry-Based Multiple Camera Head Detection in Dense Crowds” to Pellicano et al. discloses head detection in crowded environments.
US Patent Publication 2012/0293667 A1 to Baba et al. discloses obtaining first images of a region of interest (ROI) to be imaged and associated with a first time, where the first images are associated with different positions and orientations with respect to the ROI. The method also includes defining an active region in each of the first images and selecting intrinsic features in each of the first images based on the active region. The method further includes identifying a portion of the intrinsic features temporally and spatially matching intrinsic features in corresponding ones of second images of the ROI associated with a second time prior to the first time, and computing three-dimensional (3D) coordinates for the portion of the intrinsic features. Finally, the method includes computing a relative pose for the first images based on the 3D coordinates.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to HEATH E WELLS whose telephone number is (703)756-4696. The examiner can normally be reached Monday-Friday 8:00-4:00.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Ms. Jennifer Mehmood can be reached on 571-272-2976. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/Heath E. Wells/Examiner, Art Unit 2664
Date: 14 January 2026