DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Priority
Receipt is acknowledged that this application is a National Stage application of PCT/JP21/37415. Priority to PCT/JP21/37415 with a priority date of 8 October 2021 is acknowledged under 35 U.S.C. 365(c) and 37 CFR 1.78.
Information Disclosure Statement
The IDSs dated 20 March 2024 and 27 November 2024 have been considered and placed in the application file.
Claim Interpretation
Under MPEP 2143.03, "All words in a claim must be considered in judging the patentability of that claim against the prior art." In re Wilson, 424 F.2d 1382, 1385, 165 USPQ 494, 496 (CCPA 1970). As a general matter, the grammar and the ordinary meaning of the terms used in a claim, as understood by one having ordinary skill in the art, will dictate whether, and to what extent, the language limits the claim scope. Language that suggests or makes a feature or step optional, but does not require that feature or step, does not limit the scope of a claim under the broadest reasonable claim interpretation.
Under SuperGuide Corp. v. DirecTV Enters., Inc., 358 F.3d 870 (Fed. Cir. 2004), “the phrase ‘at least one of’ precedes a series of categories of criteria, and the patentee used the term ‘and’ to separate the categories of criteria, which connotes a conjunctive list. The district court correctly interpreted this phrase as requiring that the user select at least one value for each category; that is, at least one of a desired program start time, a desired program end time, a desired program service, and a desired program type.” SuperGuide, 358 F.3d at 886.
Claims 1-4, 6-9 and 11-15 recite “and.” Since “and” is conjunctive, all of the elements must be found in the prior art to reject the claim.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. § 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
Claims 4, 9 and 14 are rejected under 35 U.S.C. § 112(b) as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor regards as the invention. Claims 4, 9 and 14 recite “the position of the head of the person in the three dimension at a second timing between the first timing and the second timing.” It is unclear how the second timing can be between the first and the second timing. For the purpose of prior art analysis, Examiner assumes applicant meant that the second timing was between the first and the third timing.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
[Image: Uchiyama et al. Fig. 9, showing the use of epipolar information from multiple cameras to determine the position of a head.]
Claims 1-15 are rejected under 35 U.S.C. 103 as being unpatentable over US Patent Publication 2017/0154424 A1 (Uchiyama et al.).
Claim 1
Regarding Claim 1, Uchiyama et al. teach a non-transitory computer-readable recording medium storing a tracking program ("The object tracking unit 103 tracks the object, in the camera image, detected by the object detection unit 102," paragraph [0036]) causing a computer to execute a process of:
specifying a head region of a person from each of a plurality of images captured by a plurality of cameras ("a position of the head of a person, in the camera image, detected by the object detection unit 102," paragraph [0036]);
specifying a set of head regions corresponding to a same person based on each of positions of the head regions specified from the plurality of images ("estimating the three dimensional position of a person in accordance with the position of the head of the person in the camera image obtained from each camera," paragraph [0050] where from each camera means there are a plurality of cameras and images); and
specifying a position of a head of the person in a three dimension based on a position of the set of the head regions corresponding to the same person in a two dimension and parameters of the plurality of cameras ("estimating the three dimensional position of a person in accordance with the position of the head of the person in the camera image obtained from each camera," paragraph [0050] and "A fundamental matrix F including information on the positional relationship between the first and the second camera images can be calculated based on the positions, the orientations, and the intrinsic parameters of the cameras in the camera information stored in the camera information storage unit," paragraph [0059]).
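For clarity of the record, the cited three-dimensional position estimation may be illustrated by the following sketch of two-view triangulation. All function names, camera matrices, and coordinate values below are hypothetical and illustrative only; they are not drawn from, and are not asserted to represent, the implementation of Uchiyama et al.

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Estimate a 3D point from its projections in two camera images.

    P1, P2 : 3x4 camera projection matrices (intrinsic and extrinsic
             parameters combined).
    x1, x2 : (u, v) positions of the same head region in each image.
    Uses the direct linear transform (DLT): each view contributes two
    linear constraints on the homogeneous 3D point, and the solution is
    the null vector of the stacked constraint matrix.
    """
    A = np.array([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]  # dehomogenize

def project(P, X):
    """Project a 3D point into an image with camera matrix P."""
    h = P @ np.append(X, 1.0)
    return h[:2] / h[2]

# Hypothetical cameras: a reference view and a unit baseline along x.
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])
head = np.array([1.0, 2.0, 5.0])        # ground-truth head position
x1, x2 = project(P1, head), project(P2, head)
print(triangulate(P1, P2, x1, x2))      # recovers approximately [1, 2, 5]
```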
It is recognized that the citations and evidence provided above are derived from potentially different embodiments of a single reference. Nevertheless, it would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains to employ combinations and sub-combinations of these complementary embodiments, because Uchiyama et al. explicitly motivate doing so at least in paragraphs [0027], [00] and [0110], including “The scope of the following claims is to be accorded the broadest interpretation so as to encompass all such modifications and equivalent structures and functions.”, and otherwise motivate experimentation and optimization.
The rejection of medium claim 1 above applies mutatis mutandis to the corresponding limitations of method claim 6 and apparatus claim 11, while noting that the rejection above cites both device and method disclosures. Claims 6 and 11 are mapped below for clarity of the record and to specify any new limitations not included in claim 1.
Claim 2
Regarding claim 2, Uchiyama et al. teach the non-transitory computer-readable recording medium according to claim 1, wherein the specifying the set of the head regions includes specifying whether, based on a distance between an epipolar line ("When a distance between the representative point of a person in the second camera image and the epipolar line 801 is equal to or smaller than a predetermined value, the person in the second camera image matches the person in the first camera image," paragraph [0060]), which is included in a first image and corresponds to a second head region included in a second image, and a first head region included in the first image, the first head region and the second head region correspond to the head region corresponding to the same person ("estimating the three dimensional position of a person in accordance with the position of the head of the person in the camera image obtained from each camera," paragraph [0050] where from each camera means there are a plurality of cameras and images).
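For clarity of the record, the cited epipolar-distance matching may be illustrated by the following sketch. The fundamental matrix, coordinates, and threshold below are hypothetical and illustrative only, and are not asserted to represent the implementation of Uchiyama et al.

```python
import numpy as np

def epipolar_distance(F, x1, x2):
    """Distance from a point in the second image to the epipolar line
    induced by a point in the first image.

    F  : 3x3 fundamental matrix with the convention l2 = F @ x1.
    x1 : head position (u, v) in the first image.
    x2 : candidate head position (u, v) in the second image.
    """
    l = F @ np.array([x1[0], x1[1], 1.0])  # line a*u + b*v + c = 0
    return abs(l @ np.array([x2[0], x2[1], 1.0])) / np.hypot(l[0], l[1])

def same_person(F, x1, x2, threshold=5.0):
    """Match two detections when the candidate lies within `threshold`
    pixels of the epipolar line."""
    return epipolar_distance(F, x1, x2) <= threshold

# Hypothetical rectified stereo pair: epipolar lines are horizontal, so
# the distance reduces to the vertical offset between the detections.
F = np.array([[0.0, 0.0, 0.0], [0.0, 0.0, -1.0], [0.0, 1.0, 0.0]])
print(same_person(F, (320.0, 240.0), (100.0, 243.0)))  # True  (3 px off)
print(same_person(F, (320.0, 240.0), (100.0, 300.0)))  # False (60 px off)
```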
Claim 3
Regarding claim 3, Uchiyama et al. teach the non-transitory computer-readable recording medium according to claim 2, wherein the distance is corrected based on a size of the first head region and a size of the second head region ("In the present exemplary embodiment, the object attribute acquisition unit 104 acquires the information on the height of the object. Alternatively, the object attribute acquisition unit 104 may acquire information on the size of the object. [0039] The object attribute acquisition unit 104 includes a first position estimation unit 110," paragraphs [0038]-[0039], where a position estimation unit corrects distance, in this case using information on the size).
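For clarity of the record, one illustrative form of such a size-based correction is sketched below. The particular normalization chosen here is hypothetical and is not asserted to be that of Uchiyama et al.

```python
def size_corrected_distance(distance, size1, size2):
    """Normalize an epipolar distance by the mean apparent head size.

    Dividing by the average head-region size (in pixels) makes a fixed
    matching threshold scale-invariant: a large (near) head tolerates a
    larger pixel distance than a small (far) head.
    """
    return distance / ((size1 + size2) / 2.0)

# A 6-pixel offset between 20- and 40-pixel head regions normalizes to 0.2.
print(size_corrected_distance(6.0, 20.0, 40.0))  # 0.2
```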
Claim 4
Regarding claim 4, Uchiyama et al. teach the non-transitory computer-readable recording medium according to claim 1, wherein the process further includes: estimating, based on the position of the head of the person in the three dimension which is specified based on the plurality of images captured by each of the plurality of cameras at a first timing and a third timing, the position of the head of the person in the three dimension at a second timing between the first timing and the second timing ("Thus, the matching person is tracked over a plurality of time points. As a result of the processing, a tracking label i is obtained for each person. The tracking label i is a code for identifying each tracked person," paragraph [0049], where tracking a person over time includes teaching to interpolate positions between the timings).
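For clarity of the record, the interpolation discussed above may be illustrated by the following sketch of linear interpolation between the first and third timings. The timings and positions below are hypothetical and illustrative only.

```python
import numpy as np

def interpolate_head_position(p_first, p_third, t_first, t_second, t_third):
    """Linearly interpolate a 3D head position at an intermediate timing.

    p_first, p_third : 3D positions specified at the first and third
                       timings; t_second must lie between t_first and
                       t_third.
    """
    alpha = (t_second - t_first) / (t_third - t_first)
    return (1.0 - alpha) * np.asarray(p_first) + alpha * np.asarray(p_third)

# Hypothetical values: the head moves 2 m in x and 1 m in y over one second.
p = interpolate_head_position([0.0, 0.0, 1.7], [2.0, 1.0, 1.7], 0.0, 0.5, 1.0)
print(p)  # midpoint [1.0, 0.5, 1.7]
```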
Claim 5
Regarding claim 5, Uchiyama et al. teach the non-transitory computer-readable recording medium according to claim 1, wherein the specifying the position of the head of the person in the three dimension further includes:
specifying trajectory information of the position of the head of the person in the three dimension for each window section including continuous image frames ("Symbols (frames) 1202, each representing the person area tracked in step S403, are overlapped on the camera image 1201. The frames of the same person in different camera images are colored with the same color, so that the user can recognize the same person in different camera images," paragraph [0079]); and
associating the trajectory information for each window section ("A person in the camera image 1201 and the same person in the 3D map 1204 are colored with the same color, so that the user can easily identify the same person in the camera image 1201 and the 3D map 1204," paragraph [0080]).
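For clarity of the record, the claimed window sections and trajectory association may be illustrated by the following sketch. The window length, gap threshold, and track values are hypothetical and illustrative only, and are not asserted to represent the implementation of Uchiyama et al.

```python
import numpy as np

def split_windows(positions, window):
    """Split a per-frame sequence of 3D head positions into fixed-length
    window sections of consecutive frames."""
    return [positions[i:i + window] for i in range(0, len(positions), window)]

def windows_match(track_a, track_b, max_gap=0.5):
    """Associate two window trajectories when the last position of one
    section lies within `max_gap` of the first position of the next."""
    return np.linalg.norm(np.asarray(track_a[-1]) - np.asarray(track_b[0])) <= max_gap

# Hypothetical track: ten frames split into window sections of four frames.
track = [[0.1 * i, 0.0, 1.7] for i in range(10)]
windows = split_windows(track, 4)
print(len(windows))                           # 3 sections
print(windows_match(windows[0], windows[1]))  # True: sections join smoothly
```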
Claim 6
Regarding claim 6, Uchiyama et al. teach a tracking method ("The object tracking unit 103 tracks the object, in the camera image, detected by the object detection unit 102," paragraph [0036]) comprising:
specifying a head region of a person from each of a plurality of images captured by a plurality of cameras ("a position of the head of a person, in the camera image, detected by the object detection unit 102," paragraph [0036]);
specifying a set of head regions corresponding to a same person based on each of positions of the head regions specified from the plurality of images ("estimating the three dimensional position of a person in accordance with the position of the head of the person in the camera image obtained from each camera," paragraph [0050] where from each camera means there are a plurality of cameras and images); and
specifying a position of a head of the person in a three dimension based on a position of the set of the head regions corresponding to the same person in a two dimension and parameters of the plurality of cameras ("estimating the three dimensional position of a person in accordance with the position of the head of the person in the camera image obtained from each camera," paragraph [0050] and "A fundamental matrix F including information on the positional relationship between the first and the second camera images can be calculated based on the positions, the orientations, and the intrinsic parameters of the cameras in the camera information stored in the camera information storage unit," paragraph [0059]).
Claim 7
Regarding claim 7, Uchiyama et al. teach the tracking method according to claim 6, wherein the specifying the set of the head regions includes specifying whether, based on a distance between an epipolar line ("When a distance between the representative point of a person in the second camera image and the epipolar line 801 is equal to or smaller than a predetermined value, the person in the second camera image matches the person in the first camera image," paragraph [0060]), which is included in a first image and corresponds to a second head region included in a second image, and a first head region included in the first image, the first head region and the second head region correspond to the head region corresponding to the same person ("estimating the three dimensional position of a person in accordance with the position of the head of the person in the camera image obtained from each camera," paragraph [0050] where from each camera means there are a plurality of cameras and images).
Claim 8
Regarding claim 8, Uchiyama et al. teach the tracking method according to claim 7, wherein the distance is corrected based on a size of the first head region and a size of the second head region ("In the present exemplary embodiment, the object attribute acquisition unit 104 acquires the information on the height of the object. Alternatively, the object attribute acquisition unit 104 may acquire information on the size of the object. [0039] The object attribute acquisition unit 104 includes a first position estimation unit 110," paragraphs [0038]-[0039], where a position estimation unit corrects distance, in this case using information on the size).
Claim 9
Regarding claim 9, Uchiyama et al. teach the tracking method according to claim 6, further comprising:
estimating, based on the position of the head of the person in the three dimension which is specified based on the plurality of images captured by each of the plurality of cameras at a first timing and a third timing, the position of the head of the person in the three dimension at a second timing between the first timing and the second timing ("Thus, the matching person is tracked over a plurality of time points. As a result of the processing, a tracking label i is obtained for each person. The tracking label i is a code for identifying each tracked person," paragraph [0049], where tracking a person over time includes teaching to interpolate positions between the timings).
Claim 10
Regarding claim 10, Uchiyama et al. teach the tracking method according to claim 6, wherein the specifying the position of the head of the person in the three dimension further includes:
specifying trajectory information of the position of the head of the person in the three dimension for each window section including continuous image frames ("Symbols (frames) 1202, each representing the person area tracked in step S403, are overlapped on the camera image 1201. The frames of the same person in different camera images are colored with the same color, so that the user can recognize the same person in different camera images," paragraph [0079]); and
associating the trajectory information for each window section ("A person in the camera image 1201 and the same person in the 3D map 1204 are colored with the same color, so that the user can easily identify the same person in the camera image 1201 and the 3D map 1204," paragraph [0080]).
Claim 11
Regarding claim 11, Uchiyama et al. teach an information processing apparatus ("The object tracking unit 103 tracks the object, in the camera image, detected by the object detection unit 102," paragraph [0036]) comprising:
a memory ("The storage medium may include, for example, one or more of a hard disk, a random-access memory (RAM), a read only memory (ROM), a storage of distributed computing systems, an optical disk (such as a compact disc (CD), digital versatile disc (DVD), or Blu-ray Disc (BD)™), a flash memory device, a memory card, and the like," paragraph [0109]); and
a processor coupled to the memory ("one or more processors in the computers in the system or the device read out and execute the program," paragraph [0108]) and configured to:
specify a head region of a person from each of a plurality of images captured by a plurality of cameras ("a position of the head of a person, in the camera image, detected by the object detection unit 102," paragraph [0036]);
specify a set of head regions corresponding to a same person based on each of positions of the head regions specified from the plurality of images ("estimating the three dimensional position of a person in accordance with the position of the head of the person in the camera image obtained from each camera," paragraph [0050] where from each camera means there are a plurality of cameras and images); and
specify a position of a head of the person in a three dimension based on a position of the set of the head regions corresponding to the same person in a two dimension and parameters of the plurality of cameras ("estimating the three dimensional position of a person in accordance with the position of the head of the person in the camera image obtained from each camera," paragraph [0050] and "A fundamental matrix F including information on the positional relationship between the first and the second camera images can be calculated based on the positions, the orientations, and the intrinsic parameters of the cameras in the camera information stored in the camera information storage unit," paragraph [0059]).
Claim 12
Regarding claim 12, Uchiyama et al. teach the information processing apparatus according to claim 11, wherein the processor specifies whether, based on a distance between an epipolar line ("When a distance between the representative point of a person in the second camera image and the epipolar line 801 is equal to or smaller than a predetermined value, the person in the second camera image matches the person in the first camera image," paragraph [0060]), which is included in a first image and corresponds to a second head region included in a second image, and a first head region included in the first image, the first head region and the second head region correspond to the head region corresponding to the same person ("estimating the three dimensional position of a person in accordance with the position of the head of the person in the camera image obtained from each camera," paragraph [0050] where from each camera means there are a plurality of cameras and images).
Claim 13
Regarding claim 13, Uchiyama et al. teach the information processing apparatus according to claim 12, wherein the processor corrects the distance based on a size of the first head region and a size of the second head region ("In the present exemplary embodiment, the object attribute acquisition unit 104 acquires the information on the height of the object. Alternatively, the object attribute acquisition unit 104 may acquire information on the size of the object. [0039] The object attribute acquisition unit 104 includes a first position estimation unit 110," paragraphs [0038]-[0039], where a position estimation unit corrects distance, in this case using information on the size).
Claim 14
Regarding claim 14, Uchiyama et al. teach the information processing apparatus according to claim 11, wherein the processor estimates, based on the position of the head of the person in the three dimension which is specified based on the plurality of images captured by each of the plurality of cameras at a first timing and a third timing, the position of the head of the person in the three dimension at a second timing between the first timing and the second timing ("Thus, the matching person is tracked over a plurality of time points. As a result of the processing, a tracking label i is obtained for each person. The tracking label i is a code for identifying each tracked person," paragraph [0049], where tracking a person over time includes teaching to interpolate positions between the timings).
Claim 15
Regarding claim 15, Uchiyama et al. teach the information processing apparatus according to claim 11, wherein the processor specifies trajectory information of the position of the head of the person in the three dimension for each window section including continuous image frames ("Symbols (frames) 1202, each representing the person area tracked in step S403, are overlapped on the camera image 1201. The frames of the same person in different camera images are colored with the same color, so that the user can recognize the same person in different camera images," paragraph [0079]) and associate the trajectory information for each window section ("A person in the camera image 1201 and the same person in the 3D map 1204 are colored with the same color, so that the user can easily identify the same person in the camera image 1201 and the 3D map 1204," paragraph [0080]).
References Cited
The prior art made of record and not relied upon is considered pertinent to applicant’s disclosure.
US Patent Publication 2023/0326063 A1 to Oami discloses a detection unit (2020), a state estimation unit (2040), and a height estimation unit (2080). The detection unit (2020) detects a target person from a video frame. The state estimation unit (2040) estimates a state of the detected target person. The height estimation unit (2080) estimates a height of the person on the basis of a height of the target person in the video frame in a case where the estimated state satisfies a predetermined condition.
Non-patent publication “Geometry-Based Multiple Camera Head Detection in Dense Crowds” to Pellicano et al. discloses head detection in crowded environments.
US Patent Publication 2012/0293667 A1 to Baba et al. discloses obtaining first images of a region of interest (ROI) to be imaged and associated with a first time, where the first images are associated with different positions and orientations with respect to the ROI. The method also includes defining an active region in each of the first images and selecting intrinsic features in each of the first images based on the active region. The method further includes identifying a portion of the intrinsic features temporally and spatially matching intrinsic features in corresponding ones of second images of the ROI associated with a second time prior to the first time, and computing three-dimensional (3D) coordinates for the portion of the intrinsic features. Finally, the method includes computing a relative pose for the first images based on the 3D coordinates.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to HEATH E WELLS whose telephone number is (703)756-4696. The examiner can normally be reached Monday-Friday 8:00-4:00.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Ms. Jennifer Mehmood can be reached on 571-272-2976. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/Heath E. Wells/Examiner, Art Unit 2664
Date: 14 January 2026