Office Action Analysis: 18721255 — INFORMATION PROCESSING APPARATUS, INFORMATION PROCESSING METHOD, AND COMPUTER-READABLE MEDIUM

Office Action

§103
DETAILED ACTION
Notice of Pre-AIA  or AIA  Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA .

Information Disclosure Statement
1.	The information disclosure statement (IDS) submitted on 6/18/2024 is in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statement has been considered by the examiner.

Claim Rejections - 35 USC § 103
2.	The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

3.	Claims 1-2, 5, 10, and 12 are rejected under 35 U.S.C. 103 as being unpatentable over Zhang et al. (CN-110796079-A, hereinafter "Zhang") in view of Wang et al. (WO-2023/019699-A1, hereinafter "Wang").

4.	As per claim 1, Zhang discloses: An information processing apparatus comprising:
	at least one memory storing instructions and at least one processor configured to execute the instructions to: (Zhang, Abstract, “The invention discloses a multi-camera visitor identification method based on face depth features and human body local depth features. The multi-camera visitor identification method comprises the steps: S1, performing visitor reservation; S2, when a visitor visits, carrying out face recognition, and recording human body information; S3, performing posture skeleton detection on the human body information to obtain a human body image ... and S9, according to the identified target person, carrying out real-time track depiction by utilizing the camera ID.” and page 6, ¶ [0027], “This invention creates a three-dimensional, seamless, secure, and convenient customer visit identification system based on intelligent methods. The system mainly consists of a human body capture camera, a regular surveillance camera, and an algorithm GPU server. Visitor identification is mainly achieved by transmitting images captured by the camera to the algorithm GPU server for tracking and identification using algorithms.” and page 8, ¶ [0041], “In particular, the algorithm computation tasks in steps S1-S9 are all completed by the algorithm GPU server.”)
	[[generate,]] based on a first image taken of a person at a first angle of depression, [[a pseudo image corresponding to an image taken of the person at a second angle of depression different from the first angle of depression,]] in which transformation performed on an image region of a head part of the person is different from that performed on an image region of a torso part of the person; and (Zhang, page 5, ¶ [0015], “Preferably, in the above multi-camera visitor recognition method, in step S4, after obtaining the human body image, the human body is divided into attribute labels, and the human body image is divided into head patch, upper body patch and lower body patch. Then, the patches are subjected to affine transformation and normalization processing.” and page 5, ¶ [0011], “According to one aspect of the present invention, a multi-camera visitor recognition method ... comprising the following steps: S3 performing pose skeleton detection on the recorded human body information and obtaining a human body image; S4 extracting attribute labels and patch features from the obtained human body image; ... S6 repeating the operations of S3 and S4 sequentially on human body images transmitted back to the visitor recognition system from other ordinary surveillance cameras ...” and page 6, ¶ [0023], “This invention utilizes a combination of facial depth features and local body depth features for visitor identification, and adds multiple attribute labels to effectively constrain the identification process. This allows for identity recognition and authentication even when the camera is shooting from different angles, effectively avoiding the predicament of being unable to identify an individual due to the inability to capture a clear face.”; Examiner’s note: With the human body image dividing into a head region, upper body region, and a lower body region, each region is subjected to affine transformation independent of each other (and thus can be different).  In addition, step S4 is continually repeated resulting in the affine transformation being newly applied each time, potentially with different results.)
	create, [[based on the pseudo image,]] a recognition dictionary of the person [[whose image is taken at the second angle of depression.]] (Zhang, page 7, ¶ [0033]-[0034], “S1. First, visitor appointments are made, and the visitor's facial photo is uploaded to the visitor recognition system. Specifically, during the appointment process, visitors upload a photo of their face, as well as information about the company they are visiting and contact information. The visitor recognition system then forwards the information to the personnel of the company being visited for confirmation. After confirmation, the visitor's photo is uploaded to the whitelist of the visitor recognition system.”)

5.	Zhang doesn't explicitly disclose but Wang discloses: generate, [[based on a first image taken of a person at a first angle of depression,]] a pseudo image corresponding to an image taken of the person at a second angle of depression different from the first angle of depression, [[in which transformation performed on an image region of a head part of the person is different from that performed on an image region of a torso part of the person; and]]
	[[create,]] based on the pseudo image, [[a recognition dictionary of the person]] whose image is taken at the second angle of depression. (Wang, Fig. 4; Abstract, “A high-angle facial recognition method and system based on a 3D facial model, by means of which method and system recognition is performed by rotating a clear front-on face in a facial sample library such that the clear front-on face is at the same angle as a high-angle face to be recognized. The method comprises: firstly, collecting a clear front-on facial picture, constructing a facial sample library, and generating a 3D facial model from the front-on facial picture in the sample library; then, estimating, by using a facial posture estimation algorithm, the angle of a high-angle facial picture to be recognized, and rotating the 3D facial model such that same is at the same angle as the high-angle facial picture; and finally, inputting a generated high-angle face and a high-angle face to be recognized into a facial recognition network for recognition. By means of the method, for the problem of facial recognition in an actual overhead monitoring scenario, the high-angle face recognition accuracy is significantly improved.”)

6.	Before the effective filing date of the claimed invention, it would have been obvious to a person having ordinary skill in the art to modify the information processing apparatus of Zhang to include the disclosure of generating a pseudo image corresponding to an image taken of the person at a second angle of depression different from the first angle of depression and creating a recognition dictionary based on the image, of Wang. The motivation for this modification could have been to generate other pseudo images of the head at different viewpoints and place them in the recognition dictionary.  This helps to “fill out” the dictionary with other possible viewpoints of the person so that they may be recognized from a different perspective.  In addition, generating pseudo images saves the person from submitting multiple identification photos from different viewpoints.

7.	As per claim 2, Zhang in view of Wang discloses: The information processing apparatus according to claim 1, wherein the at least one processor is further configured to execute the instructions to:
	perform projective transformation on the image region of the torso part so that a quadrilateral representing the torso part is transformed into another quadrilateral; and (Zhang, page 5, ¶ [0015], “Preferably, in the above multi-camera visitor recognition method, in step S4, after obtaining the human body image, the human body is divided into attribute labels, and the human body image is divided into head patch, upper body patch and lower body patch. Then, the patches are subjected to affine transformation and normalization processing.”;  Examiner’s note: By subjecting each divided body image region to affine transformations, a new quadrilateral replaces the old one.)
	perform transformation on the image region of the head part so that the viewpoint from which an image of the head part is taken is changed. (Wang, Fig. 4; Abstract, “A high-angle facial recognition method and system based on a 3D facial model, by means of which method and system recognition is performed by rotating a clear front-on face in a facial sample library such that the clear front-on face is at the same angle as a high-angle face to be recognized. The method comprises: firstly, collecting a clear front-on facial picture, constructing a facial sample library, and generating a 3D facial model from the front-on facial picture in the sample library; then, estimating, by using a facial posture estimation algorithm, the angle of a high-angle facial picture to be recognized, and rotating the 3D facial model such that same is at the same angle as the high-angle facial picture; and finally, inputting a generated high-angle face and a high-angle face to be recognized into a facial recognition network for recognition. By means of the method, for the problem of facial recognition in an actual overhead monitoring scenario, the high-angle face recognition accuracy is significantly improved.”)

8.	Before the effective filing date of the claimed invention, it would have been obvious to a person having ordinary skill in the art to modify the information processing apparatus of claim 1 of Zhang to include the disclosure of performing a transformation on the image region of the head part so that the viewpoint from which an image of the head part is taken is changed, of Wang. The motivation for this modification could have been to generate another view of the head for populating the recognition dictionary.  By adding more viewpoints of the head, it may assist with the identification of the person.

9.	As per claim 5, Zhang in view of Wang discloses: The information processing apparatus according to claim 1, wherein the at least one processor is further configured to execute the instructions to:
	recognize features of a person appearing in the second image taken at the second angle of depression by referring to a recognition dictionary. (Zhang, page 6, ¶ [0028], “The present invention provides a visitor re-identification method based on the fusion of facial depth features and human body local depth feature blocks. It uses multiple cameras to identify and track visitors, and can use multiple cameras with non-overlapping perspectives deployed in the building to capture and identify visitors in real time, locate the visitor's position in the building, and delineate the visitor's route.” and page 6, ¶ [0021], “1. This invention employs an advanced intelligent method to transform the existing visitor system. The system uses the face as a unique identifier to mark other features, and combines facial depth features with local body depth features to achieve intelligent visitor recognition from multiple angles and cameras.”)

10.	Claim 10, which is similar in scope to independent claim 1, is thus rejected under the same rationale as described above. The motivation for this modification is the same as claim 1.

11.	Claim 12, which is similar in scope to independent claim 1, is thus rejected under the same rationale as described above. The motivation for this modification is the same as claim 1.

12.	Claim 3 is rejected under 35 U.S.C. 103 as being unpatentable over Zhang et al. (CN-110796079-A, hereinafter "Zhang") in view of Wang et al. (WO-2023/019699-A1, hereinafter "Wang"), and further in view of Tussy (US-2016/0063235-A1).

13.	As per claim 3, Zhang in view of Wang discloses: The information processing apparatus according to claim 1, wherein the first image is an image taken with [[a fisheye camera or an ultra-wide-angle camera.]] (Zhang, page 7, ¶ [0034], “S2. When a visitor arrives, their face is captured and detected from a specific angle, and their body information is recorded.”)

14.	Zhang in view of Wang doesn't explicitly disclose but Tussy discloses: a fisheye camera or an ultra-wide-angle camera. (Tussy, [0009], “When an actual, three-dimensional person images himself or herself close up and far away, it has been found that the biometric results are different due to the fish-eye effect of the lens. Thus, a three-dimensional person may be validated when biometric results are different in the close-up and far away images. This also allows the user to have multiple biometric profiles for each of the distances.” and [0162], “The system may also incorporate other security features when the “zoom in” movement is used as shown in FIGS. 13A and 13B. Typical cameras on a mobile device or any other device include a curved lens. This results in a “fish-eye” effect in the resulting images taken by the camera.”)

15.	Before the effective filing date of the claimed invention, it would have been obvious to a person having ordinary skill in the art to modify the information processing apparatus of claim 1 of Zhang in view of Wang to include the disclosure of utilizing a fisheye camera or an ultra-wide-angle camera to take a first image, of Tussy. The motivation for this modification could have been to utilize cameras that are commonly used for surveillance and account for their distortion.  This would mimic a real-life scenario.

16.	Claims 4 and 6-7 are rejected under 35 U.S.C. 103 as being unpatentable over Zhang et al. (CN-110796079-A, hereinafter "Zhang") in view of Wang et al. (WO-2023/019699-A1, hereinafter "Wang"), in view of Tussy (US-2016/0063235-A1), and further in view of Matsuda et al. (US-2016/0283590-A1, hereinafter "Matsuda").

17.	As per claim 4, Zhang in view of Wang, and further in view of Tussy discloses: The information processing apparatus according to claim 3, wherein the at least one processor is further configured to execute the instructions to:
	[[designate a predetermined proportion of the first image from an upper edge of the first image as an image region of the head part,]] and designate a region other than the region of the head part in the first image as the image region of the torso part in the first image. (Zhang, page 5, ¶ [0015], “Preferably, in the above multi-camera visitor recognition method, in step S4, after obtaining the human body image, the human body is divided into attribute labels, and the human body image is divided into head patch, upper body patch and lower body patch. Then, the patches are subjected to affine transformation and normalization processing.”)

18.	Zhang in view of Wang, and further in view of Tussy doesn't explicitly disclose but Matsuda discloses: designate a predetermined proportion of the first image from an upper edge of the first image as an image region of the head part, [[and designate a region other than the region of the head part in the first image as the image region of the torso part in the first image.]] (Matsuda, Fig. 3A; [0009], “According to an aspect of the invention, a search system includes circuitry configured to detect an first object to be a search target, from information of an image that is captured by an imaging device, ... divide an image region corresponding to the first object into at least two image regions based on a dividing ratio that is obtained by correcting a predetermined ratio in accordance with the parameter ...” and [0046], “FIGS. 3A and 3B illustrate examples in which person regions in the images illustrated in FIGS. 1A and 1B are divided at specific proportions. In FIGS. 3A and 3B, the horizontal direction is represented by an x coordinate, and the height direction is represented by a y coordinate. FIG. 3A illustrates an example in which a person region 13 of the person 10 in the image 1 illustrated in FIG. 1A is vertically divided at a height d1. The height d1 is calculated by multiplying the length (height) of the person region 13 in the y direction by a specific ratio. In such a case, the person region 13 in the image 1 is divided into an upper region 14 and a lower region 15 at a position corresponding to the height d1.” and [0053], “FIG. 5C is an enlarged view of a person region that corresponds to the person in FIG. 5A. FIG. 5D is an enlarged view of a person region that corresponds to the person in FIG. 5B. In FIGS. 5C and 5D, the horizontal direction is represented by an x coordinate, and the height direction is represented by a y coordinate in the same manner as in FIGS. 5A and 5B. In FIGS. 5C and 5D, person regions are normalized with respect to the height direction to cause the lengths of the person regions in the y direction to be the same for comparison.”)

19.	Before the effective filing date of the claimed invention, it would have been obvious to a person having ordinary skill in the art to modify the information processing apparatus of claim 3 of Zhang in view of Wang, and further in view of Tussy to include the disclosure of designating a predetermined proportion of a first image from an upper edge as an image region of the head part, of Matsuda. The motivation for this modification could have been to specify the parts of the image for a person’s head and torso for quick identification.  Specifically, this allows both the head and torso to be separately processed so that if there is a clear image of the person’s face, no further processing is needed.
20.	As per claim 6, Zhang in view of Wang, further in view of Tussy, and further in view of Matsuda discloses: The information processing apparatus according to claim 5, wherein a height of a first image capturing apparatus for taking the first image is different from that of a second image capturing apparatus for taking the second image. (Matsuda, [0009], “According to an aspect of the invention, a search system includes circuitry configured to detect an first object to be a search target, from information of an image that is captured by an imaging device, determine a parameter in consideration of how the first object is viewed in the image, in accordance with a height at which the imaging device is installed, a depression angle of the imaging device ...” and [0075], “The dividing information storage unit 112 stores dividing information that is referred to when a dividing position is determined. A detailed description will be given later. If a plurality of cameras 200 are present, dividing information is stored for each camera 200 with different installation conditions (heights and depression angles).”)

21.	Before the effective filing date of the claimed invention, it would have been obvious to a person having ordinary skill in the art to modify the information processing apparatus of claim 5 of Zhang in view of Wang, further in view of Tussy to include the disclosure of image capturing apparatuses at different heights, of Matsuda. The motivation for this modification could have been to reflect real-life scenarios where cameras on buildings, walls, street poles, etc. are all at different heights from each other.  By utilizing this scenario, it makes the functionality of the information processing apparatus more robust at handling image capture and recognition.

22.	As per claim 7, Zhang in view of Wang, further in view of Tussy, and further in view of Matsuda discloses: The information processing apparatus according to claim 6, wherein the at least one processor is further configured to execute the instructions to:
	create a recognition dictionary for each horizontal distance between the second image capturing apparatus and the person. (Tussy, [0009], “In one embodiment, the user may enroll in the system by providing enrollment images of the user's face. The enrollment images are taken by the camera of the mobile device as the user moves the mobile device to different positions relative to the user's head. The user may thus obtain enrollment images showing the user's face from different angles and distances.” and [0019], “When an actual, three-dimensional person images himself or herself close up and far away, it has been found that the biometric results are different due to the fish-eye effect of the lens. Thus, a three-dimensional person may be validated when biometric results are different in the close-up and far away images. This also allows the user to have multiple biometric profiles for each of the distances.” and [0068], “One or more cameras (still, video, or both) 276 are provided to capture image data for storage in the memory 210 and/or for possible transmission over a wireless or wired link or for viewing at a later time.”; Examiner’s note: All distances between a camera and a person have a horizontal and vertical distance component (X,Y).  A person can also be captured via a second camera.)

23.	Before the effective filing date of the claimed invention, it would have been obvious to a person having ordinary skill in the art to modify the information processing apparatus of claim 6 of Zhang in view of Wang, and further in view of Matsuda to include the disclosure of creating a recognition dictionary for each horizontal distance between the second image capturing apparatus and the person, of Tussy. The motivation for this modification could have been to store additional images of a person in the recognition dictionary depending on their distance away from the camera.  If the camera has some lens distortion, storing additional images may help add identification if the features of the person appear similar at various distances from the camera.


Conclusion
24.	Any inquiry concerning this communication or earlier communications from the examiner should be directed to MATTHEW CLOTHIER whose telephone number is (571)272-4667. The examiner can normally be reached Mon-Fri 8:00am-4:00pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kent Chang can be reached at (571)272-7667. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.



/MATTHEW CLOTHIER/Examiner, Art Unit 2614                                                                                                                                                                                                        
/KENT W CHANG/Supervisory Patent Examiner, Art Unit 2614
Read full office action
INFORMATION PROCESSING APPARATUS, INFORMATION PROCESSING METHOD, AND COMPUTER-READABLE MEDIUM

Interview Optional

Examiner Intelligence

Statute-Specific Performance

Office Action

Prosecution Timeline

Precedent Cases

Applications granted by this same examiner with similar technology

AI Strategy Recommendation

Prosecution Projections

Ready to respond to this office action?

INFORMATION PROCESSING APPARATUS, INFORMATION PROCESSING METHOD, AND COMPUTER-READABLE MEDIUM

Interview Optional

Examiner Intelligence

Statute-Specific Performance

Office Action

Prosecution Timeline

Precedent Cases

Applications granted by this same examiner with similar technology

AI Strategy Recommendation

Prosecution Projections

Ready to respond to this office action?

Sign in with your work email