Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first-inventor-to-file provisions of the AIA.
DETAILED ACTION
This action is responsive to the patent application filed on 3/13/2024, which is a continuation (CON) of PCT/JP21/37958, filed 10/13/2021.
This action is made Non-Final.
Claims 1-19 are pending in the case. Claims 1, 8, and 15 are independent claims.
Information Disclosure Statement
The information disclosure statements (IDS) submitted on 3/13/2024 and 11/12/2024 are in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statements are being considered by the examiner.
Drawings
The drawings filed on 3/13/2024 have been accepted by the Examiner.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
Claims 3-6 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or, for applications subject to pre-AIA 35 U.S.C. 112, the applicant) regards as the invention.
Claim 3 recites the limitation “the second program”. There is insufficient antecedent basis for this limitation in the claim. Appropriate correction is required. Dependent claims 4-6 are rejected for being dependent upon a rejected base claim.
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
Claims 1-4, 7-11, and 14-18 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Uchiyama (US 2016/0292533 A1), cited in the IDS filed 11/12/2024.
Claim 1:
Uchiyama discloses A non-transitory computer-readable recording medium storing a region detection program (0202) for causing a computer to execute a process comprising: acquiring images each which is captured by each of a plurality of imaging apparatuses that capture the respective images of a person from respective different directions (0054: the image acquisition unit 101 acquires frame images from the respective cameras. In the present exemplary embodiment, since the number of cameras is four, four images are acquired); detecting a region indicating the person from each of the images by inputting the images to a machine learning model which is generated in advance by a machine learning so as to detect the region indicating the person (0055 and 0115: the object information acquisition unit 102 extracts human body areas from the respective images acquired in step S402. As a result of the processing, the coordinates (x, y) of representative points and the heights h and widths w of rectangles representing the human body areas are acquired... a classifier called a multiclass classifier configured to classify a plurality of types of classes may be used to simultaneously perform the human body detection and the direction classification. For example, a classifier configured to classify a non-human body (background), a human body of a direction 1, a human body of a direction 2, . . . , and a human body of a direction N is prepared. The classifier is applied to a partial image acquired from a camera image by a sliding window protocol to determine whether the partial image is the non-human body (background), the human body of the direction 1, the human body of the direction 2, or the human body of the direction N. Use of the same classifier to perform the object detection and the object attribute extraction is expected to exert an effect of decreasing the amount of processing, compared to the case where different classifiers are used to perform the object detection and the object attribute extraction); and interpolating, based on a first region of the person which is detected from a first image of the images and a parameter of each of the plurality of imaging apparatuses, a second region indicating the person in a second image of the images (0057 and 0098: the object associating unit 105 associates the human bodies detected in step S403 between the images. Specifically, the object associating unit 105 performs a search to find out a correspondence between a human body detected in an image and a human body detected in another image... the three-dimensional position estimation is performed using the positional relationship of the cameras and the positions and geometric attributes (size of human body) of the plurality of objects so that an effect of eliminating a virtual image is expected to be exerted. It can especially be expected to exert an effect of efficiently eliminating a virtual image existing at a wrong distance from the cameras).
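For context, the sliding-window multiclass detection described in the quoted passage can be sketched as follows. This sketch is illustrative only; the function names, window size, and the toy classifier are assumptions for exposition, not Uchiyama's implementation. A classifier returning 0 marks background, and labels 1..N mark a human body facing direction 1..N, so detection and direction classification happen in a single pass.

```python
import numpy as np

def sliding_window_detect(image, classify, win=(128, 64), step=32):
    """Scan `image` with a sliding window. `classify(patch)` returns a class
    label: 0 for background (non-human body) or 1..N for a human body facing
    direction 1..N. Returns (y, x, label) for every non-background window."""
    h, w = image.shape[:2]
    detections = []
    for y in range(0, h - win[0] + 1, step):
        for x in range(0, w - win[1] + 1, step):
            patch = image[y:y + win[0], x:x + win[1]]
            label = classify(patch)
            if label != 0:          # 0 = background class
                detections.append((y, x, label))
    return detections

# Toy usage: a bright rectangle stands in for a person, and a threshold on
# mean intensity stands in for the multiclass classifier.
img = np.zeros((192, 128))
img[0:128, 0:64] = 1.0
clf = lambda p: 1 if p.mean() > 0.5 else 0
hits = sliding_window_detect(img, clf)
```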
Claim 2:
Uchiyama discloses the first image is an image in which the region indicating the person is detected by the machine learning model in the images, and the second image is an image in which the region indicating the person is not detected by the machine learning model in the images (0044-0049: The detection complementation unit 108 complements an object failed to be detected by the object information acquisition unit 102, based on a result of the three-dimensional position estimation unit).
Claim 3:
(EXAMINER’S INTERPRETATION: In an effort to advance prosecution, the Examiner interprets “the second program” to be a typographical error which was meant to read “the second region”).
Uchiyama discloses the plurality of imaging apparatuses are arranged in a same horizontal plane, and a width of the second program is estimated based on a height of the first region, a height of the second region, and statistical information regarding a posture of the person (0058-61: The human bodies are associated between the images by associating the representative points of the human body areas using epipolar geometry... a straight line connecting a representative point of a human body A on a camera image 1 captured by the camera 1 (501) to an optical center (not illustrated) of the camera 1 is represented by a straight line 503 called an epipolar line on a camera image 2 (502). The epipolar line refers to a line where an epipolar plane and the camera image 1 (image plane of camera 1) intersect and a line where an epipolar plane and the camera image 2 (image plane of camera 2) intersect. The epipolar plane refers to a plane passing through three points P, C1, and C2, where P is a point in a three-dimensional space that corresponds to the representative point of the human body A on the camera image 1 (501), C1 is the optical center of the camera 1, and C2 is an optical center of the camera 2... A fundamental matrix that is a matrix containing information about the positional relationship between the camera images 1 (501) and 2 (502) that is acquired based on the positions, orientations, and intrinsic parameters of the cameras is denoted by F. Further, a vector representing the two-dimensional coordinates of the human body A is denoted by x).
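For clarity, the epipolar relationship in the quoted passage can be summarized compactly. The homogeneous form below is a standard statement of the geometry the quotation describes and is provided for illustration only, using the quotation's own symbols:

```latex
% x_1, x_2: homogeneous image coordinates of the same 3-D point P
% in camera images 1 and 2; F: the fundamental matrix derived from
% the cameras' positions, orientations, and intrinsic parameters.
\mathbf{x}_2^{\top} F \, \mathbf{x}_1 = 0, \qquad \ell_2 = F \, \mathbf{x}_1
```

Here ℓ₂ is the epipolar line on camera image 2 (the line denoted 503 in the quoted passage) along which the representative point of human body A must lie, which is what permits associating detections across views.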
Claim 4:
Uchiyama discloses wherein the height of the second region is estimated by converting an end point of a vertical center line of the first region into a coordinate of an end point of a vertical center line of the person in a three dimensional space based on the parameter of an imaging apparatus which captures the first image and converting the coordinate into a coordinate in the second image based on the parameter of an imaging apparatus which captures the second image (0058-61: The human bodies are associated between the images by associating the representative points of the human body areas using epipolar geometry... a straight line connecting a representative point of a human body A on a camera image 1 captured by the camera 1 (501) to an optical center (not illustrated) of the camera 1 is represented by a straight line 503 called an epipolar line on a camera image 2 (502). The epipolar line refers to a line where an epipolar plane and the camera image 1 (image plane of camera 1) intersect and a line where an epipolar plane and the camera image 2 (image plane of camera 2) intersect. The epipolar plane refers to a plane passing through three points P, C1, and C2, where P is a point in a three-dimensional space that corresponds to the representative point of the human body A on the camera image 1 (501), C1 is the optical center of the camera 1, and C2 is an optical center of the camera 2... A fundamental matrix that is a matrix containing information about the positional relationship between the camera images 1 (501) and 2 (502) that is acquired based on the positions, orientations, and intrinsic parameters of the cameras is denoted by F. Further, a vector representing the two-dimensional coordinates of the human body A is denoted by x).
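The coordinate conversion recited in claim 4 (image point to 3-D point, then to a point in the second image) corresponds to standard pinhole-camera back-projection and reprojection. The sketch below is illustrative only: the pinhole model, the assumption of a known depth, and all names are assumptions for exposition, not the implementation of Uchiyama or of the Applicant.

```python
import numpy as np

def reproject_point(x1, depth, K1, pose1, K2, pose2):
    """Back-project pixel x1 (from camera 1) at the given depth into 3-D,
    then project that 3-D point into camera 2's image.

    K1/K2 are 3x3 intrinsic matrices; pose1/pose2 are (R, t) pairs mapping
    world coordinates to each camera's coordinates.
    """
    R1, t1 = pose1
    R2, t2 = pose2
    # Pixel -> normalized ray in camera-1 coordinates, scaled by depth.
    x1_h = np.array([x1[0], x1[1], 1.0])
    X_cam1 = depth * (np.linalg.inv(K1) @ x1_h)
    # Camera-1 coordinates -> world coordinates.
    X_world = R1.T @ (X_cam1 - t1)
    # World -> camera-2 coordinates -> pixel coordinates.
    X_cam2 = R2 @ X_world + t2
    x2_h = K2 @ X_cam2
    return x2_h[:2] / x2_h[2]

# Toy usage: identical intrinsics, camera 2 offset along x; the image-center
# pixel at depth 2 m lands left of center in camera 2's image.
K = np.array([[1000.0, 0.0, 500.0], [0.0, 1000.0, 500.0], [0.0, 0.0, 1.0]])
x2 = reproject_point((500, 500), 2.0,
                     K, (np.eye(3), np.zeros(3)),
                     K, (np.eye(3), np.array([-0.5, 0.0, 0.0])))
```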
Claim 7:
Uchiyama discloses the plurality of imaging apparatuses are arranged in a same vertical plane, and a height of the second region is estimated based on a width of the first region, a width of the second region, and statistical information regarding a posture of the person (0058-61: The human bodies are associated between the images by associating the representative points of the human body areas using epipolar geometry... a straight line connecting a representative point of a human body A on a camera image 1 captured by the camera 1 (501) to an optical center (not illustrated) of the camera 1 is represented by a straight line 503 called an epipolar line on a camera image 2 (502). The epipolar line refers to a line where an epipolar plane and the camera image 1 (image plane of camera 1) intersect and a line where an epipolar plane and the camera image 2 (image plane of camera 2) intersect. The epipolar plane refers to a plane passing through three points P, C1, and C2, where P is a point in a three-dimensional space that corresponds to the representative point of the human body A on the camera image 1 (501), C1 is the optical center of the camera 1, and C2 is an optical center of the camera 2... A fundamental matrix that is a matrix containing information about the positional relationship between the camera images 1 (501) and 2 (502) that is acquired based on the positions, orientations, and intrinsic parameters of the cameras is denoted by F. Further, a vector representing the two-dimensional coordinates of the human body A is denoted by x; the arrangement of cameras being in the same vertical plane is a design choice obvious over the teachings of Uchiyama).
Claim 8:
Uchiyama discloses A region detection apparatus comprising: a memory; and a processor coupled to the memory (0202) and configured to: acquire images each which is captured by each of a plurality of imaging apparatuses that capture the respective images of a person from respective different directions (0054: the image acquisition unit 101 acquires frame images from the respective cameras. In the present exemplary embodiment, since the number of cameras is four, four images are acquired); detect a region indicating the person from each of the images by inputting the images to a machine learning model which is generated in advance by a machine learning so as to detect the region indicating the person (0055 and 0115: the object information acquisition unit 102 extracts human body areas from the respective images acquired in step S402. As a result of the processing, the coordinates (x, y) of representative points and the heights h and widths w of rectangles representing the human body areas are acquired... a classifier called a multiclass classifier configured to classify a plurality of types of classes may be used to simultaneously perform the human body detection and the direction classification. For example, a classifier configured to classify a non-human body (background), a human body of a direction 1, a human body of a direction 2, . . . , and a human body of a direction N is prepared. The classifier is applied to a partial image acquired from a camera image by a sliding window protocol to determine whether the partial image is the non-human body (background), the human body of the direction 1, the human body of the direction 2, or the human body of the direction N. Use of the same classifier to perform the object detection and the object attribute extraction is expected to exert an effect of decreasing the amount of processing, compared to the case where different classifiers are used to perform the object detection and the object attribute extraction); and interpolate, based on a first region of the person which is detected from a first image of the images and a parameter of each of the plurality of imaging apparatuses, a second region indicating the person in a second image of the images (0057 and 0098: the object associating unit 105 associates the human bodies detected in step S403 between the images. Specifically, the object associating unit 105 performs a search to find out a correspondence between a human body detected in an image and a human body detected in another image... the three-dimensional position estimation is performed using the positional relationship of the cameras and the positions and geometric attributes (size of human body) of the plurality of objects so that an effect of eliminating a virtual image is expected to be exerted. It can especially be expected to exert an effect of efficiently eliminating a virtual image existing at a wrong distance from the cameras).
Claim 9:
Uchiyama discloses the first image is an image in which the region indicating the person is detected by the machine learning model in the images, and the second image is an image in which the region indicating the person is not detected by the machine learning model in the images (0044-0049: The detection complementation unit 108 complements an object failed to be detected by the object information acquisition unit 102, based on a result of the three-dimensional position estimation unit).
Claim 10:
Uchiyama discloses the plurality of imaging apparatuses are arranged in a same horizontal plane, and a width of the second program is estimated based on a height of the first region, a height of the second region, and statistical information regarding a posture of the person (0058-61: The human bodies are associated between the images by associating the representative points of the human body areas using epipolar geometry... a straight line connecting a representative point of a human body A on a camera image 1 captured by the camera 1 (501) to an optical center (not illustrated) of the camera 1 is represented by a straight line 503 called an epipolar line on a camera image 2 (502). The epipolar line refers to a line where an epipolar plane and the camera image 1 (image plane of camera 1) intersect and a line where an epipolar plane and the camera image 2 (image plane of camera 2) intersect. The epipolar plane refers to a plane passing through three points P, C1, and C2, where P is a point in a three-dimensional space that corresponds to the representative point of the human body A on the camera image 1 (501), C1 is the optical center of the camera 1, and C2 is an optical center of the camera 2... A fundamental matrix that is a matrix containing information about the positional relationship between the camera images 1 (501) and 2 (502) that is acquired based on the positions, orientations, and intrinsic parameters of the cameras is denoted by F. Further, a vector representing the two-dimensional coordinates of the human body A is denoted by x).
Claim 11:
Uchiyama discloses wherein the height of the second region is estimated by converting an end point of a vertical center line of the first region into a coordinate of an end point of a vertical center line of the person in a three dimensional space based on the parameter of an imaging apparatus which captures the first image and converting the coordinate into a coordinate in the second image based on the parameter of an imaging apparatus which captures the second image (0058-61: The human bodies are associated between the images by associating the representative points of the human body areas using epipolar geometry... a straight line connecting a representative point of a human body A on a camera image 1 captured by the camera 1 (501) to an optical center (not illustrated) of the camera 1 is represented by a straight line 503 called an epipolar line on a camera image 2 (502). The epipolar line refers to a line where an epipolar plane and the camera image 1 (image plane of camera 1) intersect and a line where an epipolar plane and the camera image 2 (image plane of camera 2) intersect. The epipolar plane refers to a plane passing through three points P, C1, and C2, where P is a point in a three-dimensional space that corresponds to the representative point of the human body A on the camera image 1 (501), C1 is the optical center of the camera 1, and C2 is an optical center of the camera 2... A fundamental matrix that is a matrix containing information about the positional relationship between the camera images 1 (501) and 2 (502) that is acquired based on the positions, orientations, and intrinsic parameters of the cameras is denoted by F. Further, a vector representing the two-dimensional coordinates of the human body A is denoted by x).
Claim 14:
Uchiyama discloses the plurality of imaging apparatuses are arranged in a same vertical plane, and a height of the second region is estimated based on a width of the first region, a width of the second region, and statistical information regarding a posture of the person (0058-61: The human bodies are associated between the images by associating the representative points of the human body areas using epipolar geometry... a straight line connecting a representative point of a human body A on a camera image 1 captured by the camera 1 (501) to an optical center (not illustrated) of the camera 1 is represented by a straight line 503 called an epipolar line on a camera image 2 (502). The epipolar line refers to a line where an epipolar plane and the camera image 1 (image plane of camera 1) intersect and a line where an epipolar plane and the camera image 2 (image plane of camera 2) intersect. The epipolar plane refers to a plane passing through three points P, C1, and C2, where P is a point in a three-dimensional space that corresponds to the representative point of the human body A on the camera image 1 (501), C1 is the optical center of the camera 1, and C2 is an optical center of the camera 2... A fundamental matrix that is a matrix containing information about the positional relationship between the camera images 1 (501) and 2 (502) that is acquired based on the positions, orientations, and intrinsic parameters of the cameras is denoted by F. Further, a vector representing the two-dimensional coordinates of the human body A is denoted by x; the arrangement of cameras being in the same vertical plane is a design choice obvious over the teachings of Uchiyama).
Claim 15:
Uchiyama discloses A region detection method for executing a process comprising: acquiring images each which is captured by each of a plurality of imaging apparatuses that capture the respective images of a person from respective different directions (0054: the image acquisition unit 101 acquires frame images from the respective cameras. In the present exemplary embodiment, since the number of cameras is four, four images are acquired); detecting a region indicating the person from each of the images by inputting the images to a machine learning model which is generated in advance by a machine learning so as to detect the region indicating the person (0055 and 0115: the object information acquisition unit 102 extracts human body areas from the respective images acquired in step S402. As a result of the processing, the coordinates (x, y) of representative points and the heights h and widths w of rectangles representing the human body areas are acquired... a classifier called a multiclass classifier configured to classify a plurality of types of classes may be used to simultaneously perform the human body detection and the direction classification. For example, a classifier configured to classify a non-human body (background), a human body of a direction 1, a human body of a direction 2, . . . , and a human body of a direction N is prepared. The classifier is applied to a partial image acquired from a camera image by a sliding window protocol to determine whether the partial image is the non-human body (background), the human body of the direction 1, the human body of the direction 2, or the human body of the direction N. Use of the same classifier to perform the object detection and the object attribute extraction is expected to exert an effect of decreasing the amount of processing, compared to the case where different classifiers are used to perform the object detection and the object attribute extraction); and interpolating, based on a first region of the person which is detected from a first image of the images and a parameter of each of the plurality of imaging apparatuses, a second region indicating the person in a second image of the images (0057 and 0098: the object associating unit 105 associates the human bodies detected in step S403 between the images. Specifically, the object associating unit 105 performs a search to find out a correspondence between a human body detected in an image and a human body detected in another image... the three-dimensional position estimation is performed using the positional relationship of the cameras and the positions and geometric attributes (size of human body) of the plurality of objects so that an effect of eliminating a virtual image is expected to be exerted. It can especially be expected to exert an effect of efficiently eliminating a virtual image existing at a wrong distance from the cameras).
Claim 16:
Uchiyama discloses the first image is an image in which the region indicating the person is detected by the machine learning model in the images, and the second image is an image in which the region indicating the person is not detected by the machine learning model in the images (0044-0049: The detection complementation unit 108 complements an object failed to be detected by the object information acquisition unit 102, based on a result of the three-dimensional position estimation unit).
Claim 17:
Uchiyama discloses the plurality of imaging apparatuses are arranged in a same horizontal plane, and a width of the second program is estimated based on a height of the first region, a height of the second region, and statistical information regarding a posture of the person (0058-61: The human bodies are associated between the images by associating the representative points of the human body areas using epipolar geometry... a straight line connecting a representative point of a human body A on a camera image 1 captured by the camera 1 (501) to an optical center (not illustrated) of the camera 1 is represented by a straight line 503 called an epipolar line on a camera image 2 (502). The epipolar line refers to a line where an epipolar plane and the camera image 1 (image plane of camera 1) intersect and a line where an epipolar plane and the camera image 2 (image plane of camera 2) intersect. The epipolar plane refers to a plane passing through three points P, C1, and C2, where P is a point in a three-dimensional space that corresponds to the representative point of the human body A on the camera image 1 (501), C1 is the optical center of the camera 1, and C2 is an optical center of the camera 2... A fundamental matrix that is a matrix containing information about the positional relationship between the camera images 1 (501) and 2 (502) that is acquired based on the positions, orientations, and intrinsic parameters of the cameras is denoted by F. Further, a vector representing the two-dimensional coordinates of the human body A is denoted by x).
Claim 18:
Uchiyama discloses wherein the height of the second region is estimated by converting an end point of a vertical center line of the first region into a coordinate of an end point of a vertical center line of the person in a three dimensional space based on the parameter of an imaging apparatus which captures the first image and converting the coordinate into a coordinate in the second image based on the parameter of an imaging apparatus which captures the second image (0058-61: The human bodies are associated between the images by associating the representative points of the human body areas using epipolar geometry... a straight line connecting a representative point of a human body A on a camera image 1 captured by the camera 1 (501) to an optical center (not illustrated) of the camera 1 is represented by a straight line 503 called an epipolar line on a camera image 2 (502). The epipolar line refers to a line where an epipolar plane and the camera image 1 (image plane of camera 1) intersect and a line where an epipolar plane and the camera image 2 (image plane of camera 2) intersect. The epipolar plane refers to a plane passing through three points P, C1, and C2, where P is a point in a three-dimensional space that corresponds to the representative point of the human body A on the camera image 1 (501), C1 is the optical center of the camera 1, and C2 is an optical center of the camera 2... A fundamental matrix that is a matrix containing information about the positional relationship between the camera images 1 (501) and 2 (502) that is acquired based on the positions, orientations, and intrinsic parameters of the cameras is denoted by F. Further, a vector representing the two-dimensional coordinates of the human body A is denoted by x).
Allowable Subject Matter
Claims 12, 13 and 19 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
Note
The Examiner cites particular columns, line numbers, and/or paragraph numbers in the references as applied to the claims above for the convenience of the Applicant(s). Although the specified citations are representative of the teachings in the art and are applied to the specific limitations within the individual claims, other passages and figures may apply as well. It is respectfully requested that, in preparing responses, the Applicant fully consider the references in their entirety as potentially teaching all or part of the claimed invention, as well as the context of the passage as taught by the prior art or disclosed by the Examiner. See MPEP 2123.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure and is listed in the attached PTOL-892 form.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MOHAMMED-IBRAHIM ZUBERI, whose telephone number is (571) 270-7761. The examiner can normally be reached M-Th 8-6 and Fri 7-12/OFF.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Steph Hong, can be reached at (571) 272-4124. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/MOHAMMED H ZUBERI/Primary Examiner, Art Unit 2178