DETAILED ACTION
Notice of AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Priority
Receipt is acknowledged of certified copies of papers submitted under 35 U.S.C. 119(a)-(d), which papers have been placed of record in the file.
Information Disclosure Statement
The information disclosure statements (IDS) submitted on 12/04/2023, 04/17/2024, 01/31/2025, 02/24/2025, 04/11/2025, 08/11/2025, 10/21/2025, and 11/21/2025 are being considered by the examiner.
Specification Objections
The specification is objected to because of the following informalities:
On page 5, line 12, “a COMS image sensor…” should read “a CMOS image sensor…” Appropriate correction is required.
Claim Objections
Claims 5 and 6 are objected to because of the following informalities:
In claim 5, line 2, the term “a NMS processing unit” should be changed to “a Non-Maximum Suppression (NMS) processing unit” in order to avoid a typographical issue.
In claim 6, line 17, the term “the low-confidence rectangular,” should be changed to “the low-confidence rectangular frame,” in order to avoid a typographical issue.
Appropriate correction is required.
Applicant is advised that should claim 1 be found allowable, claim 7 will be objected to under 37 CFR 1.75 as being a substantial duplicate thereof. When two claims in an application are duplicates or else are so close in content that they both cover the same thing, despite a slight difference in wording, it is proper after allowing one claim to object to the other as being a substantial duplicate of the allowed claim. See MPEP § 608.01(m).
Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.
The following is a quotation of pre-AIA 35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.
The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art. The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is invoked.
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph:
(A) the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function;
(B) the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and
(C) the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function.
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function.
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function.
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier.
Claims 1-7 recite limitations that use the generic placeholder “unit” coupled with functional language, and these limitations invoke 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph:
Claim 1 recites the limitation, “an acquisition unit configured to acquire…” [Line 9].
Claim 1 recites the limitation, “a providing unit configured to provide…” [Line 12].
Claim 1 recites the limitation, “a calculation unit configured to perform calculation…” [Line 15].
Claim 1 recites the limitation, “an addition unit configured to perform addition…” [Line 19].
Claim 1 recites the limitation, “a deletion unit configured to perform deletion…” [Line 22].
Claim 1 recites the limitation, “an output unit configured to output…” [Line 25].
Claim 2 recites the limitation, “the calculation unit calculates…” [Line 2].
Claim 3 recites the limitation, “the calculation unit uses…” [Line 2].
Claim 4 recites the limitation, “a motion detection unit that detects…” [Lines 2-3].
Claim 4 recites the limitation, “the calculation unit adjusts…” [Line 4].
Claim 5 recites the limitation, “a NMS processing unit configured to perform…” [Line 2].
Claim 5 recites the limitation, “the calculation unit calculates…” [Line 9].
Claim 6 recites the limitation, “the addition unit counts…” [Line 18].
Claim 6 recites the limitation, “the addition unit reduces…” [Line 20].
Claim 7 recites the limitation, “an acquisition unit acquires…” [Line 7].
Claim 7 recites the limitation, “a providing unit provides…” [Line 10].
Claim 7 recites the limitation, “a calculation unit performs calculation…” [Line 13].
Claim 7 recites the limitation, “an addition unit performs addition…” [Line 16].
Claim 7 recites the limitation, “a deletion unit performs deletion…” [Line 19].
Claim 7 recites the limitation, “an output unit outputs…” [Line 22].
Because these claim limitations are being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, they are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
After a careful analysis, as disclosed above, and a careful review of the specification, the corresponding structure for the following limitations in claims 1-7 is evaluated as follows:
“acquisition unit” (Fig. 2, #17. Page 11, Lines [10-13] - the object detection device 17 is an acquisition unit that acquires the detection target image IMc. [Page 5, Lines 24-32 and Page 6, Lines 1-2] the object detection device 17 includes a processor 17a, and a memory 17b. The processor 17a is a CPU, a GPU, or a DSP, for example. The memory 17b includes a RAM and a ROM. The memory 17b stores program codes or commands configured to cause the processor 17a to execute processes. The memory 17b, that is, a computer readable medium, includes any available medium that is accessible by a general-purpose computer or a dedicated computer. The object detection device 17 may include a hardware circuit such as an ASIC or an FPGA. The object detection device 17, which is a process circuit, may include one or more processors that operates in accordance with computer programs, one or more hardware circuits such as the ASIC or the FPGA, or a combination thereof (Wherein the acquisition unit does have sufficient structure associated with it, namely the disclosed processor and hardware circuit.).)
“providing unit” ([Page 7, Lines 10-14] As shown in FIG. 4, at Step S12, the object detection device 17 provides a plurality of rectangular frames A in the image IM. Thus, the object detection device 17 includes a providing unit that provides a plurality of rectangular frames A in the image IM. The rectangular frames A correspond to candidate areas for the area where the target object T is present in the image IM. (Wherein the providing unit does not have sufficient structure associated with it).)
“calculation unit” ([Page 13, Lines 17-20]-the object detection device 17 includes a calculation unit that calculates the similarity score indicating the index of the degree of similarity between the bounding box B output to the comparison image IMp and the rectangular frame A in the detection target image IMc. [Page 14, Lines 6-9]-when the similarity score is R, R is expressed by the following equation (2): R = Rd^α × Ra^β × Rs^γ … (2) (Wherein the calculation unit does not have sufficient structure associated with it.).)
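For purposes of illustration only, the following is a minimal sketch of how equation (2) could be evaluated, assuming Rd, Ra, and Rs denote the position, aspect ratio, and area scores and α, β, γ their importance weights; the identifiers below are hypothetical and do not appear in the specification:

```python
# Illustrative sketch only, not code from the specification: equation (2),
# R = Rd^alpha * Ra^beta * Rs^gamma, assuming rd, ra, rs are the position,
# aspect ratio, and area scores (each in [0, 1]) and alpha, beta, gamma are
# importance weights. All names are hypothetical.
def similarity_score(rd: float, ra: float, rs: float,
                     alpha: float, beta: float, gamma: float) -> float:
    return (rd ** alpha) * (ra ** beta) * (rs ** gamma)
```

Under this reading, increasing a given exponent (e.g., α > 1) amplifies the influence of the corresponding component score on R.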
“addition unit” ([Page 16, Lines 26-32 and Page 17, Line 1]-As shown in FIG. 9, at Step S25, the object detection device 17 performs addition processing to increase the confidence score of the rectangular frame A having a similarity score that satisfies a predetermined condition, among the plurality of rectangular frames A. That is, the object detection device 17 increases the confidence score of the rectangular frame A based on the similarity score. Therefore, the object detection device 17 includes an addition unit that performs the addition processing to increase the confidence score of the rectangular frame A whose similarity score satisfies the predetermined condition. (Wherein the addition unit does not have sufficient structure associated with it).)
“deletion unit” ([Page 8, Lines 13-19]-As shown in FIG. 4, at Step S13, the object detection device 17 performs deletion processing to delete the rectangular frames A having a confidence score less than a confidence score threshold from the plurality of rectangular frames A provided to the image IM. Therefore, the object detection device 17 includes a deletion unit that performs the deletion processing to delete the rectangular frames A having a confidence score less than the confidence score threshold. The number of the rectangular frames A is reduced by Step S13. (Wherein the deletion unit does not have sufficient structure associated with it).)
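For purposes of illustration only, a minimal sketch of the addition and deletion processing as characterized in the two preceding paragraphs, assuming hypothetical names (Frame, boost, sim_threshold, conf_threshold) that do not appear in the specification:

```python
# Hypothetical sketch of the cited addition/deletion processing: a frame's
# confidence score is increased when its similarity score satisfies a
# predetermined condition (here, meeting sim_threshold), and frames whose
# confidence score remains below conf_threshold are then deleted.
from dataclasses import dataclass

@dataclass
class Frame:
    confidence: float   # confidence score of the rectangular frame A
    similarity: float   # similarity score R against the stored bounding box B

def addition_then_deletion(frames: list[Frame],
                           sim_threshold: float,
                           conf_threshold: float,
                           boost: float) -> list[Frame]:
    for f in frames:
        if f.similarity >= sim_threshold:   # predetermined condition
            f.confidence += boost           # addition processing
    # deletion processing: drop frames below the confidence score threshold
    return [f for f in frames if f.confidence >= conf_threshold]
```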
“output unit” ([Page 9, Lines 31-32 and Page 10, Lines 1-4]-As shown in FIG. 4, at Step S15, the object detection device 17 outputs the rectangular frame A remaining after the NMS processing as a bounding box B indicating the area where the target object T is present. That is, the object detection device 17 includes an output unit configured to output a rectangular frame remaining after the deletion processing as the bounding box for the comparison image. (Wherein the output unit does not have sufficient structure associated with it).)
“motion detection unit” ([Page 24, Line 31-32 and Page 25, Line 1]-The object detection device 17 may detect a motion of the forklift truck 10. That is, the object detection device 17 may include a motion detection unit that detects the motion of the forklift truck 10. The object detection device 17 may detect the motion of the forklift truck 10, for example, based on information input from the forklift truck 10. Examples of the information input from the forklift truck 10 include a measurement result by an inertial measurement unit (IMU), a steering wheel operation by an operator of the forklift truck 10, and an operation amount of an accelerator pedal by the operator of the forklift truck 10. Furthermore, the object detection device 17 may detect the motion of the forklift truck 10 by performing processing such as optical flow on the acquired image IM. Then, the object detection device 17 adjusts α, β, γ according to the detected motion of the forklift truck 10. That is, the object detection device 17 adjusts the importance of the position score, the aspect ratio score, and the area score according to the operation of the forklift truck 10. (Wherein the motion detection unit does not have sufficient structure associated with it).)
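For illustration only, one conceivable way the α, β, γ adjustment described above could respond to detected motion; the specific mapping (halving the position weight during a turn, the 2.0 m/s speed threshold) is an assumption and is not disclosed in the specification:

```python
# Hypothetical sketch of adjusting the importance weights alpha, beta, gamma
# according to detected vehicle motion. The particular rules below are
# illustrative assumptions, not taken from the record.
def adjust_weights(turning: bool, speed_mps: float,
                   alpha: float, beta: float, gamma: float):
    if turning:
        alpha *= 0.5        # assume the position score is less reliable in a turn
    if speed_mps > 2.0:     # assumed speed threshold
        beta *= 0.8         # apparent aspect ratio changes quickly at speed
        gamma *= 0.8        # apparent area changes quickly at speed
    return alpha, beta, gamma
```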
“NMS processing unit” ([Page 8, Lines 27-32]-As shown in FIG. 4, at Step S14, the object detection device 17 performs NMS (Non Maximum Suppression) processing when the plurality of rectangular frames A remain after the deletion processing at Step S13. Therefore, the object detection device 17 has an NMS processing unit that performs the NMS processing. The number of rectangular frames A is further reduced by Step S14. (Wherein the NMS processing unit does not have sufficient structure associated with it).)
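As general background for the NMS processing referenced above, the following is a sketch of conventional greedy non-maximum suppression, with boxes expressed as (x1, y1, x2, y2) tuples; this is the textbook algorithm, not necessarily the exact variant implemented by the object detection device 17:

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    if inter == 0.0:
        return 0.0
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: keep the highest-scoring box, suppress overlapping boxes."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order
                 if iou(boxes[best], boxes[i]) < iou_threshold]
    return keep  # indices of the rectangular frames that survive NMS
```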
If applicant does not intend to have these limitations interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, applicant may: (1) amend the claim limitations to avoid their being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitations recite sufficient structure to perform the claimed function so as to avoid their being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
Claims 1-7 along with their dependent claims are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor, or for pre-AIA the applicant, regards as the invention.
Claim 1 recites the limitation, “a providing unit configured to provide…” [Line 12].
Claim 1 recites the limitation, “a calculation unit configured to perform calculation…” [Line 15].
Claim 1 recites the limitation, “an addition unit configured to perform addition…” [Line 19].
Claim 1 recites the limitation, “a deletion unit configured to perform deletion…” [Line 22].
Claim 1 recites the limitation, “an output unit configured to output…” [Line 25].
Claim 2 recites the limitation, “the calculation unit calculates…” [Line 2].
Claim 3 recites the limitation, “the calculation unit uses…” [Line 2].
Claim 4 recites the limitation, “a motion detection unit that detects…” [Lines 2-3].
Claim 4 recites the limitation, “the calculation unit adjusts…” [Line 4].
Claim 5 recites the limitation, “a NMS processing unit configured to perform…” [Line 2].
Claim 5 recites the limitation, “the calculation unit calculates…” [Line 9].
Claim 6 recites the limitation, “the addition unit counts…” [Line 18].
Claim 6 recites the limitation, “the addition unit reduces…” [Line 20].
Claim 7 recites the limitation, “a providing unit provides…” [Line 10].
Claim 7 recites the limitation, “a calculation unit performs calculation…” [Line 13].
Claim 7 recites the limitation, “an addition unit performs addition…” [Line 16].
Claim 7 recites the limitation, “a deletion unit performs deletion…” [Line 19].
Claim 7 recites the limitation, “an output unit outputs…” [Line 22].
Claims 1-7 respectively invoke 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph. However, the written description fails to disclose the corresponding structure, material, or acts for performing the entire claimed function and to clearly link the structure, material, or acts to the function. The specification is devoid of adequate structure to perform the claimed functions. The specification does not provide sufficient detail such that one of ordinary skill in the art would understand which structure performs the claimed function.
Therefore, the claims are indefinite and are rejected under 35 U.S.C. 112(b) or pre-AIA 35 U.S.C. 112, second paragraph.
Applicant may:
(a) Amend the claim so that the claim limitation will no longer be interpreted as a limitation under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph;
(b) Amend the written description of the specification such that it expressly recites what structure, material, or acts perform the entire claimed function, without introducing any new matter (35 U.S.C. 132(a)); or
(c) Amend the written description of the specification such that it clearly links the structure, material, or acts disclosed therein to the function recited in the claim, without introducing any new matter (35 U.S.C. 132(a)).
If applicant is of the opinion that the written description of the specification already implicitly or inherently discloses the corresponding structure, material, or acts and clearly links them to the function so that one of ordinary skill in the art would recognize what structure, material, or acts perform the claimed function, applicant should clarify the record by either:
(a) Amending the written description of the specification such that it expressly recites the corresponding structure, material, or acts for performing the claimed function and clearly links or associates the structure, material, or acts to the claimed function, without introducing any new matter (35 U.S.C. 132(a)); or
(b) Stating on the record what the corresponding structure, material, or acts, which are implicitly or inherently set forth in the written description of the specification, perform the claimed function. For more information, see 37 CFR 1.75(d) and MPEP §§ 608.01(o) and 2181.
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.
The following is a quotation of the first paragraph of pre-AIA 35 U.S.C. 112:
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.
Claims 1-7 along with their dependent claims are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA), first paragraph, as failing to comply with the written description requirement. The claims contain subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for pre-AIA the inventor(s), at the time the application was filed, had possession of the claimed invention. As described above, the disclosure does not provide adequate structure to perform the claimed function in the recited limitations.
Claim 1 recites the limitation, “a providing unit configured to provide…” [Line 12].
Claim 1 recites the limitation, “a calculation unit configured to perform calculation…” [Line 15].
Claim 1 recites the limitation, “an addition unit configured to perform addition…” [Line 19].
Claim 1 recites the limitation, “a deletion unit configured to perform deletion…” [Line 22].
Claim 1 recites the limitation, “an output unit configured to output…” [Line 25].
Claim 2 recites the limitation, “the calculation unit calculates…” [Line 2].
Claim 3 recites the limitation, “the calculation unit uses…” [Line 2].
Claim 4 recites the limitation, “a motion detection unit that detects…” [Lines 2-3].
Claim 4 recites the limitation, “the calculation unit adjusts…” [Line 4].
Claim 5 recites the limitation, “a NMS processing unit configured to perform…” [Line 2].
Claim 5 recites the limitation, “the calculation unit calculates…” [Line 9].
Claim 6 recites the limitation, “the addition unit counts…” [Line 18].
Claim 6 recites the limitation, “the addition unit reduces…” [Line 20].
Claim 7 recites the limitation, “a providing unit provides…” [Line 10].
Claim 7 recites the limitation, “a calculation unit performs calculation…” [Line 13].
Claim 7 recites the limitation, “an addition unit performs addition…” [Line 16].
Claim 7 recites the limitation, “a deletion unit performs deletion…” [Line 19].
Claim 7 recites the limitation, “an output unit outputs…” [Line 22].
The specification does not demonstrate that applicant has made an invention that achieves the claimed function because the invention is not described with sufficient detail such that one of ordinary skill in the art can reasonably conclude that the inventor had possession of the claimed invention.
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.
Claims 1-2 and 7 are rejected under 35 U.S.C. 102(a)(1)/(a)(2) as being anticipated by FUKAGAI (US 20190050994 A1), hereinafter referenced as FUKAGAI.
Regarding Claim 1, FUKAGAI explicitly teaches an object detection device (Fig. 1, #1 called a detecting device. Paragraph [0046]), which performs object detection processing to detect an area where a target object is present in images captured by a camera (Fig. 1. Paragraph [0046]-FUKAGAI discloses the detecting device 1 according to one embodiment enables, by a method to be described later, recognition and tracking of a moving object moving within an image of an image series, for example, a plurality of images (which may be referred to as “frames”) arranged in a time series of moving images (wherein the moving object is the target object.).), the object detection device comprising:
a memory (Fig. 1, #11 and #17 called memory units) configured to store a bounding box indicating the area where the target object is present (Fig. 1, 7, and 9. Paragraph [0047]-FUKAGAI discloses the memory unit 11 may store image data 111 (for example, data of an image series) as an example of data input to the detecting device 1, and the memory unit 17 may store a recognition result 171 as an example of data output from the detecting device 1. The memory units 11 and 17 may be implemented by at least a part of a storage area of hardware such as a memory or a storage device provided to a computer operating as the detecting device 1. Incidentally, the memory units 11 and 17 may be integrated and managed as one memory unit (wherein the recognition result is the bounding box).), the bounding box being output to a comparison image of the images (Fig 3. Paragraph [0079]-FUKAGAI discloses the input image provided with the recognition result output from the estimation result selecting unit 16 is stored and accumulated as a recognition result 171 in the memory unit 17.);
a processor (Fig. 29, #10a called a processor/CPU) configured to execute program codes or commands stored in the memory (Fig. 29. Paragraph [0266]-FUKAGAI discloses the processor 10a is an example of an arithmetic processing device that performs various kinds of control and operation. The processor 10a may be mutually communicatably coupled to each of the blocks 10b to 10f by a bus 10i. An integrated circuit (IC) such as a CPU, a GPU, an MPU, a DSP, an ASIC, or a PLD (for example, an FPGA) may be used as the processor 10a.);
an acquisition unit (Fig. 29, #10a called a processor) configured to acquire a detection target image of the images (Fig. 9. Paragraph [0048]- FUKAGAI discloses the image preprocessing unit 12 obtains an input image on an image-by-image (frame-by-frame) basis from the image data 111 of the memory unit 11, and performs preprocessing on the input image.), the detection target image being captured at a time after the comparison image is captured (Fig. 9, illustrates the use of multiple images taken at times t=1, t=2, and t=3 (wherein the image at t=3 is the detection target image and images t=1 and t=2 are comparison images.). Paragraph [0129]-FUKAGAI discloses the obtaining unit 162 may use, at time t=3, the information of the predicted presence region at t=3, the information being estimated by the score correcting unit 161 at time t=2, for example, based on the positional information of the moving object at t=1 and t=2.);
a providing unit configured to provide a plurality of rectangular frames each indicating a candidate for the area where the target object is present in the detection target image (Fig. 3 and 9. Paragraph [0060]-FUKAGAI discloses as illustrated in FIG. 3, the RPN layer 140 analyzes the feature maps based on the feature maps, and calculates and outputs the proposed regions of the feature maps. The proposed regions are an example of candidate regions in which objects are likely to be present. (wherein the candidate regions are rectangular frames));
a calculation unit configured to perform calculation processing to calculate a similarity score that is an index indicating a degree of similarity between the bounding box in the comparison image and the rectangular frame in the detection target image (Fig. 3. Paragraph [0064]- FUKAGAI discloses the proposed regions may include “scores” indicating likelihood of objects being present in the regions. In the example illustrated in FIG. 3, a score is expressed as a numerical value to three decimal places (the larger the numerical value, the higher the likelihood) with “1.000” as a maximum value. The number of proposed regions output from the RPN layer 140 may be limited to proposed regions whose scores have given numerical values (for example, numerical values equal to or more than “0.800”) (Wherein the likelihood score is a similarity score.).);
an addition unit configured to perform addition processing to increase a confidence score of a rectangular frame, out of the plurality of rectangular frames, having the similarity score that satisfies a predetermined condition (Fig. 14. Paragraph [0150]-FUKAGAI discloses the score correcting unit 161 calculates correction information that increases the scores of detected regions close to the estimated predicted presence region (step S22). Incidentally, the correction information may be information that increases the scores of proposed regions close to (for example, having a large overlap with) the predicted region. The correction information may be, for example, the scores themselves after correction of the proposed regions, or may be, for example, coefficients for weighting that makes the scores higher than the scores of other proposed regions.);
a deletion unit configured to perform deletion processing to delete a rectangular frame, out of the plurality of rectangular frames, having the confidence score less than a confidence score threshold after the addition processing (Fig. 28. Paragraph [0245]-FUKAGAI discloses FIG. 28 is a diagram of assistance in explaining an example of generation and pruning of hypotheses over a plurality of frames by MHT. The score correcting unit 161 may repeatedly generate assignment hypotheses over a plurality of frames. At this time, the score correcting unit 161 may select hypotheses in which likelihood of a combination of assignment hypotheses is highest as an estimation result from a latest hypothesis tree. Incidentally, the hypothesis tree may be pruned in appropriate timing according to limitations of computing resources such as a memory and a processor.); and
an output unit configured to output a rectangular frame remaining after the deletion processing as the bounding box for the comparison image (Fig. 6 and 9, illustrate outputting an image with a bounding box, called a recognition result. Paragraph [0047]-FUKAGAI discloses the memory unit 11 may store image data 111 (for example, data of an image series) as an example of data input to the detecting device 1, and the memory unit 17 may store a recognition result 171 as an example of data output from the detecting device 1. Further in Paragraph [0079]-FUKAGAI discloses the input image provided with the recognition result output from the estimation result selecting unit 16 is stored and accumulated as a recognition result 171 in the memory unit 17.).
Regarding Claim 2, FUKAGAI explicitly teaches the object detection device according to claim 1, wherein the calculation unit calculates the similarity score using at least one of (Fig. 3. Paragraph [0064]- FUKAGAI discloses the proposed regions may include “scores” indicating likelihood of objects being present in the regions. In the example illustrated in FIG. 3, a score is expressed as a numerical value to three decimal places (the larger the numerical value, the higher the likelihood) with “1.000” as a maximum value. The number of proposed regions output from the RPN layer 140 may be limited to proposed regions whose scores have given numerical values (for example, numerical values equal to or more than “0.800”) (Wherein the likelihood score is a similarity score.).):
a position score that is the index indicating the degree of similarity between a position of the bounding box in the comparison image and a position of the rectangular frame in the detection target image (Fig. 9 and 11. Paragraph [0260]-FUKAGAI discloses the weighting may be performed based on weight information as an example of movement information, for example, weights such as the “distances from the predicted position,” the “Mahalanobis distance,” the “scores of the detected regions,” or the “likelihood of tracking.” In other words, the score correcting unit 161 may be said to perform weighting based on the movement information for indexes of candidate regions present within the gate region having, as a center thereof, a region in which an object may be present in an input image to be input next, the region being obtained by the tracking filter.);
an aspect ratio score that is the index indicating the degree of similarity between an aspect ratio of the bounding box in the comparison image and an aspect ratio of the rectangular frame in the detection target image (Fig. 17. Paragraph [0177]-FUKAGAI discloses the score correcting unit 161 may set a variable range r of the central position such that (Δx^2 + Δy^2)^(1/2) < r, and set variable ranges of the width and height of the region such that |Δw| < Δw_max and |Δh| < Δh_max. Incidentally, the value of r and the values of Δw_max and Δh_max may be values adjustable based on the kind of the detected object, a frame rate at a time of observation, the dimensions (w, h) of the detected rectangular region, and the like. Alternatively, the value of r and the values of Δw_max and Δh_max may be fixed values complying with a given rule (wherein the variable range is the aspect ratio.).); and
an area score that is the index indicating the degree of similarity between an area of the bounding box in the comparison image and an area of the rectangular frame in the detection target image (Fig. 17. Paragraph [0178]-FUKAGAI discloses the score correcting unit 161 may evaluate the size of an overlap between the detected region of an object in a certain frame and the detected region of the object in a previous frame by an intersection of union (IoU) value, and determine that the same object is detected when the IoU value is equal to or more than a certain threshold value (wherein the size of overlap is an area score.).).
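By way of illustration, the following sketch shows one plausible way position, aspect ratio, and area scores of the kind recited in claim 2 could be computed between a stored bounding box and a candidate rectangular frame. The formulas and names are assumptions for illustration; they are not taken from the claims, the specification, or FUKAGAI:

```python
import math

def component_scores(box_b, frame_a):
    """Hypothetical position / aspect ratio / area scores in [0, 1], comparing
    the stored bounding box B (comparison image) with a rectangular frame A
    (detection target image). Boxes are (x1, y1, x2, y2) with positive size."""
    def center(r):
        return ((r[0] + r[2]) / 2, (r[1] + r[3]) / 2)
    def size(r):
        return (r[2] - r[0], r[3] - r[1])

    (cbx, cby), (cax, cay) = center(box_b), center(frame_a)
    (wb, hb), (wa, ha) = size(box_b), size(frame_a)

    # Position score: closer centers give a higher score (scaled by diagonal).
    diag = math.hypot(wb, hb)
    rd = max(0.0, 1.0 - math.hypot(cbx - cax, cby - cay) / diag)
    # Aspect ratio score: ratio of the smaller to the larger aspect ratio.
    ra = min(wb / hb, wa / ha) / max(wb / hb, wa / ha)
    # Area score: ratio of the smaller to the larger area.
    rs = min(wb * hb, wa * ha) / max(wb * hb, wa * ha)
    return rd, ra, rs
```

These three values could then be combined per equation (2) as sketched earlier in this action.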
Regarding Claim 7, FUKAGAI teaches an object detection device (Fig. 1, Paragraph [0046]), which performs object detection processing to detect an area where a target object is present in images captured by a camera (Fig. 1. Paragraph [0046]-FUKAGAI discloses the detecting device 1 according to one embodiment enables, by a method to be described later, recognition and tracking of a moving object moving within an image of an image series, for example, a plurality of images (which may be referred to as “frames”) arranged in a time series of moving images (wherein the moving object is the target object.).), the object detection device comprising:
a memory (Fig. 1, #11 and #17 called memory units) stores a bounding box indicating the area where the target object is present (Fig. 1, 7, and 9. Paragraph [0047]-FUKAGAI discloses the memory unit 11 may store image data 111 (for example, data of an image series) as an example of data input to the detecting device 1, and the memory unit 17 may store a recognition result 171 as an example of data output from the detecting device 1. The memory units 11 and 17 may be implemented by at least a part of a storage area of hardware such as a memory or a storage device provided to a computer operating as the detecting device 1. Incidentally, the memory units 11 and 17 may be integrated and managed as one memory unit (wherein the recognition result is the bounding box).), the bounding box being output to a comparison image of the images (Fig 3. Paragraph [0079]-FUKAGAI discloses the input image provided with the recognition result output from the estimation result selecting unit 16 is stored and accumulated as a recognition result 171 in the memory unit 17.);
a processor (Fig. 29, #10a called a processor/CPU) executes program codes or commands stored in the memory (Fig. 29. Paragraph [0266]-FUKAGAI discloses the processor 10a is an example of an arithmetic processing device that performs various kinds of control and operation. The processor 10a may be mutually communicatably coupled to each of the blocks 10b to 10f by a bus 10i. An integrated circuit (IC) such as a CPU, a GPU, an MPU, a DSP, an ASIC, or a PLD (for example, an FPGA) may be used as the processor 10a.);
an acquisition unit (Fig. 29, #10a called a processor) acquires a detection target image of the images (Fig. 9. Paragraph [0048]- FUKAGAI discloses the image preprocessing unit 12 obtains an input image on an image-by-image (frame-by-frame) basis from the image data 111 of the memory unit 11, and performs preprocessing on the input image.), the detection target image being captured at a time after the comparison image is captured (Fig. 9, illustrates the use of multiple images taken at various times t=1, t=2, and t=3 (wherein the image at t=3 is the detection target image and images t=1 and t=2 are comparison images.). Paragraph [0129]-FUKAGAI discloses the obtaining unit 162 may use, at time t=3, the information of the predicted presence region at t=3, the information being estimated by the score correcting unit 161 at time t=2, for example, based on the positional information of the moving object at t=1 and t=2.);
a providing unit provides a plurality of rectangular frames each indicating a candidate for the area where the target object is present in the detection target image (Fig. 3 and 9. Paragraph [0060]-FUKAGAI discloses as illustrated in FIG. 3, the RPN layer 140 analyzes the feature maps based on the feature maps, and calculates and outputs the proposed regions of the feature maps. The proposed regions are an example of candidate regions in which objects are likely to be present);
a calculation unit performs calculation processing to calculate a similarity score that is an index indicating a degree of similarity between the bounding box in the comparison image and the rectangular frame in the detection target image (Fig. 3. Paragraph [0064]- FUKAGAI discloses the proposed regions may include “scores” indicating likelihood of objects being present in the regions. In the example illustrated in FIG. 3, a score is expressed as a numerical value to three decimal places (the larger the numerical value, the higher the likelihood) with “1.000” as a maximum value. The number of proposed regions output from the RPN layer 140 may be limited to proposed regions whose scores have given numerical values (for example, numerical values equal to or more than “0.800”) (Wherein the likelihood score is a similarity score.).);
an addition unit performs addition processing to increase a confidence score of a rectangular frame, out of the plurality of rectangular frames, having the similarity score that satisfies a predetermined condition (Fig. 14. Paragraph [0150]-FUKAGAI discloses the score correcting unit 161 calculates correction information that increases the scores of detected regions close to the estimated predicted presence region (step S22). Incidentally, the correction information may be information that increases the scores of proposed regions close to (for example, having a large overlap with) the predicted region. The correction information may be, for example, the scores themselves after correction of the proposed regions, or may be, for example, coefficients for weighting that makes the scores higher than the scores of other proposed regions.);
a deletion unit performs deletion processing to delete a rectangular frame, out of the plurality of rectangular frames, having the confidence score less than a confidence score threshold after the addition processing (Fig. 28. Paragraph [0245]-FUKAGAI discloses FIG. 28 is a diagram of assistance in explaining an example of generation and pruning of hypotheses over a plurality of frames by MHT. The score correcting unit 161 may repeatedly generate assignment hypotheses over a plurality of frames. At this time, the score correcting unit 161 may select hypotheses in which likelihood of a combination of assignment hypotheses is highest as an estimation result from a latest hypothesis tree. Incidentally, the hypothesis tree may be pruned in appropriate timing according to limitations of computing resources such as a memory and a processor.); and
an output unit outputs a rectangular frame remaining after the deletion processing as the bounding box for the comparison image (Fig. 6 and 9, illustrate outputting an image with a bounding box, called a recognition result. Paragraph [0047]-FUKAGAI discloses the memory unit 11 may store image data 111 (for example, data of an image series) as an example of data input to the detecting device 1, and the memory unit 17 may store a recognition result 171 as an example of data output from the detecting device 1. Further in Paragraph [0079]-FUKAGAI discloses the input image provided with the recognition result output from the estimation result selecting unit 16 is stored and accumulated as a recognition result 171 in the memory unit 17.).
Allowable Subject Matter
Claims 3-6, along with their dependent claims, are objected to as being dependent upon rejected base claim 1, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims, once the specification and claim objections, along with the 112(a) and 112(b) rejections, are overcome.
The following is a statement of reasons for the indication of allowable subject matter:
Regarding Claim 3, the prior art fails to explicitly teach that the calculation unit uses a product of the position score, the aspect ratio score, and the area score for the similarity score.
Regarding Claim 4, the prior art fails to explicitly teach that the calculation unit adjusts importance of the position score, the aspect ratio score, and the area score based on the motion of the vehicle detected by the motion detection unit.
Regarding Claim 5, the prior art fails to explicitly teach, wherein the calculation unit calculates the similarity score between the bounding box output to the comparison image and the rectangular frames remaining after the NMS processing as the calculation processing.
Regarding Claim 6, the prior art fails to explicitly teach that the memory stores whether one of the rectangular frames to be the bounding box output to the comparison image is a low-confidence rectangular frame having the confidence score lower than the confidence score threshold before the addition processing or a high-confidence rectangular frame having the confidence score equal to or higher than the confidence score threshold before the addition processing,
a series of low confidence is a state in which the rectangular frames having the similarity score that satisfies the predetermined condition is the low-confidence rectangular frame, and the one of the rectangular frames to be output as the bounding box to the comparison image used for calculating the similarity score between the rectangular frame having the similarity score that satisfies the predetermined condition and the one of the rectangular frames to be output as the bounding box is the low-confidence rectangular,
the addition unit counts the number of consecutive times of the series of low confidence over a plurality of times of the object detection processing, and
the addition unit reduces an increase amount of the confidence score in the addition processing depending on the number of consecutive counts.
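For readability, the following sketch illustrates one possible reading of the counting behavior recited in claim 6: the addition unit counts consecutive object detection passes in which the series-of-low-confidence state holds, and reduces the confidence score increase accordingly. The class name and the linear decay rule are assumptions for illustration only:

```python
# Hypothetical sketch of claim 6's counting behavior. The addition unit
# counts consecutive detection passes in the "series of low confidence"
# state and reduces the increase amount of the confidence score as that
# count grows. The linear decay is an illustrative assumption.
class AdditionUnit:
    def __init__(self, base_boost: float, decay: float):
        self.base_boost = base_boost
        self.decay = decay
        self.consecutive_low = 0  # count over repeated detection passes

    def boost_for(self, series_of_low_confidence: bool) -> float:
        if series_of_low_confidence:
            self.consecutive_low += 1
        else:
            self.consecutive_low = 0
        # reduce the increase amount depending on the consecutive count
        return max(0.0, self.base_boost - self.decay * self.consecutive_low)
```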
Conclusion
The prior art made of record and not relied upon, which is considered pertinent to applicant’s disclosure, is listed below.
KONISHI (US 20190308320 A1) – An object recognition processing apparatus includes: a model data acquisition unit configured to acquire three-dimensional model data of an object; a measurement unit configured to acquire measurement data including three-dimensional position information of the object; a position/orientation recognition unit configured to recognize a position/orientation of the object based on the three-dimensional model data and the measurement data; a similarity score calculation unit configured to calculate a similarity score indicating a degree of similarity between the three-dimensional model data and the measurement data in a position/orientation recognition result of the object; a reliability calculation unit configured to calculate an index indicating a feature of a three-dimensional shape of the object, and calculate a reliability of the similarity score based on the index; and an integrated score calculation unit configured to calculate an integrated score indicating a quality of the position/orientation recognition result based on the similarity score and the reliability……Abstract.
SHIRAI et al. (US 20200082184 A1) – An object detection device acquires a plurality of images captured by a camera mounted in a moving body and acquires movement information of the camera. The object detection device extracts an edge from a first image among the acquired images. The object detection device calculates a plurality of candidate points corresponding to the edge in a second image, based on the movement information and calculates candidate points corresponding to a plurality of heights as height candidates of the edge. The object detection device calculates a correlation index value between the edge and each of the plurality of candidate points and detects the candidate point with the largest correlation as a correspondence point of the edge. The object detection device detects an object at a position of the edge when the correlation index value of the detected correspondence point is equal to or smaller than a predetermined threshold.. ……Abstract.
IKEDA (US 20180286077 A1) – An object counting system includes an acquisition means for acquiring information of an estimation area which is a partial area of an image with which partial area a predetermined condition related to objects to be counted shown in the image are associated, the estimation area being a unit of area for estimating the number of the objects to be counted, a setting means for setting the estimation area in the image in such a manner that the estimation area indicated by the acquired information of the estimation area includes the objects to be counted which are not included in the objects to be counted in a different estimation area, and which satisfy the predetermined condition, an estimation means for estimating, in each estimation area, the number of the objects to be counted shown in the estimation area set in the image, and a computation means for computing a density of the objects to be counted in an area where predetermined areas in the estimation area are overlapped using the number of the objects to be counted that has been estimated in each estimation area. ……Abstract.
MIKURIYA et al. (US 20170278250 A1) – An image processing device is provided with an extracting unit that extracts a region from an image in accordance with an operation of an operator, an adding unit that adds a frame to the region, a removing unit that removes black pixels connected to the frame, and an output unit that outputs a process result of the removing unit. ……Abstract.
CHEN et al. (US 20180342070 A1) – Techniques and systems are provided for maintaining blob trackers for one or more video frames. For example, a blob tracker can be identified for a current video frame. The blob tracker is associated with a blob detected for the current video frame, and the blob includes pixels of at least a portion of one or more objects in the current video frame. One or more characteristics of the blob tracker are determined. The one or more characteristics are based on a bounding region history of the blob tracker. A confidence value is determined for the blob tracker based on the determined one or more characteristics, and a status of the blob tracker is determined based on the determined confidence value. The status of the blob tracker indicates whether to maintain the blob tracker for the one or more video frames. For example, the determined status can include a first type of blob tracker that is output as an identified blob tracker-blob pair, a second type of blob tracker that is maintained for further analysis, or a third type of blob tracker that is removed from a plurality of blob trackers maintained for the one or more video frames. ……Abstract.
SHU et al. (US 20210012527 A1) – An image processing method is provided for an electronic device. The method includes obtaining a target image comprising a target object; recognizing target two-dimensional location coordinates of the target object in the target image and a target attribute type corresponding to the target object; and obtaining a target three-dimensional point cloud associated with the target image. The method also include, according to a mapping relationship between the target three-dimensional point cloud and all pixels in the target image, obtaining three-dimensional location coordinates corresponding to pixels in the target two-dimensional location coordinates, as target three-dimensional location coordinates; determining a setting region in three-dimensional map data according to the target three-dimensional location coordinates; and setting the target attribute type for the target object in the setting region. Electronic device and non-transitory computer-readable storage medium counterparts are also contemplated. ……Abstract.
HIRAMA et al. (US 20210090267 A1) – An information processing system includes a first information processing apparatus including a first camera and configured to extract a first area from a first image acquired by the first camera, acquire a first feature value of the first area, determine a first location corresponding to the first area using the first image and a current location of the first information processing apparatus, and transmit the first feature value and the determined first location, and a second information processing apparatus including a second camera and configured to receive the first feature value and the first location, acquire a second image using the second camera, determine whether the first area is included in the second image using the first feature value, and upon determining that the first area is included in the second image, determine a current location of the second information processing apparatus using the second image and the first location. ……Abstract.
WANG et al. (US 20190065895 A1) – Techniques and systems are provided for prioritizing objects for object recognition in one or more video frames. For example, a current video frame is obtained, and a objects are detected in the current video frame. State information associated with the objects is determined. Priorities for the objects can also be determined. For example, a priority can be determined for an object based on state information associated with the object. Object recognition is performed for at least one object from the objects based on priorities determined for the at least one object. For instance, object recognition can be performed for objects having higher priorities before objects having lower priorities. ……Abstract.
AMIN et al. (US 12333793 B2) – A method for machine-based training of a computer-implemented network for common detecting, tracking, and classifying of at least one object in a video image sequence having a plurality of successive individual images. A combined error may be determined during the training, which error results from the errors of the determining of the class identification vector, determining of the at least one identification vector, the determining of the specific bounding box regression, and the determining of the inter-frame regression. ……Abstract.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ETHAN N WOLFSON whose telephone number is (571)272-1898. The examiner can normally be reached Monday - Friday 8:00 am - 5:00 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Chineyere Wills-Burns, can be reached at (571) 272-9752. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/ETHAN N WOLFSON/Examiner, Art Unit 2673
/CHINEYERE WILLS-BURNS/Supervisory Patent Examiner, Art Unit 2673