DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on December 08, 2025 has been entered.
Status of Claims
Claims 1-10 and 16-25 are pending. Claims 11-15 are cancelled.
Response to Amendments
In light of Applicant’s amendments to claims 1 and 16, the 35 U.S.C. 112(a) rejections of record of claims 1-10 and 16-25 for failing to satisfy the written description requirement have been withdrawn.
Response to Arguments
Applicant’s amendments to independent claims 1 and 16, which have altered the scope of the claims of the instant application, have necessitated the new ground(s) of rejection presented in this Office action. Accordingly, because Applicant’s arguments are directed solely to the amended portions of the claims, new analyses have been presented below, which render Applicant’s arguments moot.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-10 and 16-25 are rejected under 35 U.S.C. 101 as being directed to an abstract idea. The claims recite an embedded device and method for dividing an image into cells and detecting either multiple objects or the background within each cell. With respect to independent system claim 16:
STEP 1: Do the claims fall within one of the statutory categories?
YES. Claim 16 is directed to a device, i.e., a system or a machine.
STEP 2A (PRONG 1): Is the claim directed to a law of nature, a natural phenomenon or an abstract idea?
YES, the claims are directed toward a mental process (i.e., abstract idea).
The limitations “generate [..] a grid including multiple cells wherein a cell of the multiple cells of the grid maps to a region of one or more pixels in the image”, “detect, in a first cell of the multiple cells a centroid of an object from multiple objects that are detectable classes distinct from one another”, “detect, in a second cell of the multiple cells, a background based on none of the multiple objects being detected in the second cell”, and “generate a bounding box for the first cell as an object detection output for the object based on detecting the centroid of the object in the cell, wherein a size of the bounding box corresponds to a size of the cell”, as drafted, recite an abstract idea: a process that, under its broadest reasonable interpretation, covers performance of the limitations in the mind of a person, i.e., concepts performed in the human mind (including observation, evaluation, judgment, and opinion).
As such, a person could divide an image into a grid and determine, either mentally or with pen and paper, whether each cell within the grid contains background or the center of an object, with some degree of error or lack thereof. The mere nominal recitation that the various steps are executed by a processor (e.g., a processing unit) does not take the limitations out of the mental process grouping. Thus, the claims recite a mental process.
STEP 2A (PRONG 2): Does the claim recite additional elements that integrate the judicial exception into a practical application?
NO, the claims do not recite additional elements that integrate the judicial exception into a practical application.
The additional elements “a memory”, “a processor”, “process an image using a neural network”, and “as result of processing the image using the neural network” are recited at a high level of generality and merely equate to “apply it”, or otherwise merely use a generic computer as a tool to perform an abstract idea, which is not indicative of integration into a practical application per MPEP 2106.05(f). See also MPEP 2106.04(a)(2)(III) with respect to Mental Processes: “Nor do the courts distinguish between claims that recite mental processes performed by humans and claims that recite mental processes performed on a computer”. See also MPEP 2106.04(a)(2)(III)(C)(3) (using a computer as a tool to perform a mental process) and MPEP 2106.04(a)(2)(III)(D), as well as the case law cited therein.
The additional element “output the bounding box as the object detection output for the object” is recited as merely amounting to necessary data gathering and post-solution activity, which is not indicative of integration into a practical application per MPEP 2106.05(g), Insignificant Extra-Solution Activity.
STEP 2B: Does the claim recite additional elements that amount to significantly more than the judicial exception?
NO.
The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception because, as discussed above with respect to integration of the abstract idea into a practical application, the additional steps/elements/limitations amount to no more than an abstract idea performed on a computer. The additional elements simply append well-understood, routine, conventional (WURC) activity previously known in the industry, specified at a high level of generality, to the judicial exception, per MPEP 2106.05(d) and 2106.07(a)(III). Therefore, claim 16 is not patent eligible.
In addition, the elements of claim 1 are analyzed in the same manner as those of claim 16. Therefore, independent claims 1 and 16 are not patent eligible either.
A similar analysis applies to dependent claims 2-10 and 17-25, which, under their broadest reasonable interpretation, are identified as being directed either to mere data gathering or to an abstract idea (a mental process and/or a mathematical calculation), and as neither reciting additional elements that integrate the judicial exception into a practical application nor reciting additional elements that amount to significantly more than the judicial exception.
For all of the above reasons, claims 1-10 and 16-25: (a) are directed toward an abstract idea, (b) do not recite additional elements that integrate the judicial exception into a practical application, and (c) do not recite additional elements that amount to significantly more than the judicial exception. Accordingly, claims 1-10 and 16-25 are not eligible subject matter under 35 U.S.C. 101.
Claim Interpretation
Claims 8 and 23 have been given the following interpretation under the broadest reasonable interpretation in light of the specification. Claim 8 (and similarly claim 23) recites arbitrary sizes of memory, input image resolution, and rate of processing frames. The specification does not establish the significance of these specific values. However, the specification does state: “For example, each of the multiple images 1700 could have a resolution of at least 96 pixels by 96 pixels with each pixel being red, blue, or green (e.g., a 96x96x3 dataset). An object detection system (e.g., the object detection system 1500) could be implemented on an embedded device (e.g., a microcontroller), using less than 100 kB of memory, to detect the centroids in the multiple images 1700 at a rate of at least 10 frames per second. The multiple images 1700 could be received from a camera connected to the embedded device that is streaming the images at a continuous video frame rate”. These values are given by way of example, and one of ordinary skill in the art may consider any memory size, input image size, and processing rate. For completeness and compact prosecution, the Examiner has interpreted claims 8 and 23 as an embedded device using memory to process multiple frames of an input image.
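As a rough arithmetic check of the example values quoted from the specification (this is illustrative arithmetic only, not a statement about any actual implementation of record): a single 96x96 RGB frame at 8 bits per channel fits comfortably within the quoted 100 kB budget, and a 10 frames-per-second rate allows 100 ms of processing per frame.

```python
# Memory footprint of one 96x96 RGB frame at 8 bits per channel.
frame_bytes = 96 * 96 * 3      # 27,648 bytes per frame
budget_bytes = 100 * 1024      # 100 kB example budget from the specification

# Per-frame time budget at the quoted rate of at least 10 frames per second.
ms_per_frame = 1000 / 10       # 100 ms available to process each frame
```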
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 1, 3, 5-6, 9, 16, 18, 20-21, and 24 are rejected under 35 U.S.C. 103 as obvious over Johnston (“Tutorial: Running YOLOv5 Machine Learning Detection on a Raspberry Pi 4” - Publication Year 2021), in view of Fung et al. (US 2019/0012551 A1), in further view of Datahacker (“#025 CNN Bounding Box Predictions” – Publication Year 2018), and in further view of Hui (“Real-time Object Detection with YOLO, YOLOv2 and now YOLOv3” - Publication Year 2018).
Regarding claim 16, Johnston teaches “An embedded device of an object detection system, comprising: a memory; and a processor configured to execute instructions stored in the memory to (Johnston page 3 paragraph 1 "Embedded devices are computer systems that have all their RAM, CPU’s, and other accessories attached to a single board. The Raspberry Pi 4, for instance, is an embedded system with all processing components, USB slots, power ports, and much more built-in that allows it to run as a tiny computer for many purposes"):
output the bounding box as the object detection output for the object (Johnston page 10 paragraph 1 "And in the inference/output/ folder, you should see multiple pictures including this one" and the figure on page 11 shows the output being bounding boxes of the image).”
[media_image1.png, greyscale, 270 x 325]
Johnston Page 11 Figure
Johnston does not explicitly teach “process an image using a neural network; generate as result of processing the image using the neural network, a grid including multiple cells, an object detection output for the object based on detecting the centroid of the object in the cell”.
However, in an analogous field of endeavor, Fung teaches “process an image using a neural network (Fung paragraph [0075] "The neural network processing unit 124 can be configured to receive the image data for further processing. As an illustrative example, with reference to FIG. 2D, the image 216 can be configured 416x416 pixel image that can be provided to the neural network 122 to complete YOLO object detection in a single pass");
generate, as result of processing the image using the neural network, a grid including multiple cells, wherein the grid is separate from the image, and wherein a cell of the multiple cells of the grid maps to a region of one or more pixels in the image (Fung paragraph [0076] "The neural network processing unit 124 can be configured to process the image data and divide the image into a grid. The grid can include an SxS number of cells that each can be further processed to predict bounding boxes, as discussed below")”.
[media_image2.png, greyscale, 687 x 568]
Fung Figure 4
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention of the instant application to modify the embedded system using the YOLO algorithm as taught by Johnston to include the details of YOLO (You Only Look Once) as described by Fung, because such a modification is the result of applying a known technique/algorithm to a known device ready for improvement to yield predictable results. YOLO, a known algorithm, is applicable to the embedded system object detection because both share characteristics and capabilities; namely, both are directed to object detection and to improving current methods, such as “common detection systems [using] a sliding window approach to object detection that can be inefficient and time consuming”, as described in paragraph [0002] of Fung. Therefore, it would have been recognized that modifying the embedded system to include the YOLO object detection algorithm would have yielded predictable results because (i) the level of ordinary skill in the art demonstrated by the applied references shows the ability to incorporate the YOLO algorithm into an embedded device and (ii) the benefits of such a combination would have been recognized by those of ordinary skill in the art.
The combination of Johnston and Fung does not explicitly teach “detect, in a first cell of the multiple cells, a centroid of an object from multiple objects that are detectable classes distinct from one another; detect, in a second cell of the multiple cells, a background based on none of the multiple objects being detected in the second cell; generate a bounding box for the first cell as an object detection output for the object based on detecting the centroid of the object in the cell”.
However, Datahacker teaches “detect, in a first cell of the multiple cells, a centroid of an object from multiple objects that are detectable classes distinct from one another (Datahacker page 5 left hand figure and paragraph 3 "Subsequently, this analyzed image has two objects which are located in the remaining six grid cells. And what the YOLO algorithm does, it takes the midpoint of each of the two objects and then assigns the object to the grid cell that contains the midpoint. So, the left car is assigned to the green grid cell, whereas the car on the right is assigned to the orange grid cell");
[media_image3.png, greyscale, 618 x 669]
Datahacker Page 5 left hand Figure
detect, in a second cell of the multiple cells, a background based on none of the multiple objects being detected in the second cell (Datahacker page 5 left hand figure and paragraph 2 "Let's start with the upper left grid cell, so for each grid cell we can define a vector […] Let's start with the upper left grid cell. For this grid cell, we [see] that there is no object present. So, the label vector y for the upper left grid cell will be Pc = 0");
and generate a bounding box for the first cell as an object detection output for the object based on detecting the centroid of the object in the cell (Datahacker page 5 left hand figure and paragraph 3 "Subsequently, this analyzed image has two objects which are located in the remaining six grid cells. And what the YOLO algorithm does, it takes the midpoint of each of the two objects and then assigns the object to the grid cell that contains the midpoint. So, the left car is assigned to the green grid cell, whereas the car on the right is assigned to the orange grid cell")”.
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention of the instant application to modify the object detection system as taught by Johnston and Fung to include the underlying algorithm of YOLO as taught by Datahacker, because such a modification is the result of applying a known technique/algorithm to a known device ready for improvement to yield predictable results. More specifically, “A good way to get the more accurate output bounding boxes is with the YOLO algorithm”, as stated by Datahacker on page 3 paragraph 3. This known benefit is applicable to the embedded system object detection because both share characteristics and capabilities; namely, both are directed to object detection. Additionally, Datahacker merely provides additional details of the algorithm described in the Fung disclosure. Therefore, it would have been recognized that modifying the embedded object detection device as taught by Johnston and Fung to overlay a grid and identify a background in a cell when no object is detected would have yielded predictable results because (i) the level of ordinary skill in the art demonstrated by the applied references shows the ability to incorporate all of the YOLO algorithm for object detection and (ii) the benefits of such a combination would have been recognized by those of ordinary skill in the art.
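The grid-cell assignment that Datahacker describes, assigning each object to the cell containing its midpoint and treating cells with no midpoint as background (Pc = 0), can be sketched as follows. The grid size, image dimensions, and box coordinates below are illustrative only and are not taken from any cited reference.

```python
def assign_objects_to_cells(boxes, image_w, image_h, s=3):
    """Map each object's centroid (box midpoint) to a grid cell.

    boxes: list of (x_min, y_min, x_max, y_max) object bounding boxes.
    Returns an s x s grid of labels: 1 (object centroid present)
    or 0 (background, no centroid in the cell).
    """
    grid = [[0] * s for _ in range(s)]
    for x_min, y_min, x_max, y_max in boxes:
        # Midpoint (centroid) of the object's box.
        cx = (x_min + x_max) / 2
        cy = (y_min + y_max) / 2
        # Index of the grid cell that contains the midpoint.
        col = min(int(cx / image_w * s), s - 1)
        row = min(int(cy / image_h * s), s - 1)
        grid[row][col] = 1
    return grid

# Two objects in a hypothetical 300x300 image on a 3x3 grid: each is
# assigned to exactly one cell; all other cells are labeled background.
cells = assign_objects_to_cells([(10, 180, 90, 260), (200, 170, 290, 270)], 300, 300)
```

Each object lands in the single cell containing its midpoint, mirroring Datahacker's description of the left car going to one cell and the right car to another, with every remaining cell labeled as background.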
The combination of Johnston, Fung, and Datahacker does not explicitly teach “generate a bounding box [and] wherein a size of the bounding box corresponds to a size of the cell”.
However, Hui teaches “generate a bounding box for the first cell as an object detection output for the object based on detecting the centroid of the object in the cell (Hui page 6 Figure and paragraph 1 "Each grid cell predicts a fixed number of boundary boxes. In this example, the yellow grid cell makes two boundary box predictions (blue boxes) to locate where the person is"), wherein a size of the bounding box corresponds to a size of the cell (the Hui Figure on page 6 shows both bounding boxes, yellow and blue, where the yellow bounding box corresponds to the size of the cell)”.
[media_image4.png, greyscale, 469 x 800]
Hui Page 6 Figure
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention of the instant application to modify the object detection system using the YOLO algorithm taught by Johnston, Fung, and Datahacker to include the additional details of YOLO (You Only Look Once) as described by Hui, because such a modification is the result of applying a known technique/algorithm to a known device ready for improvement to yield predictable results. More specifically, YOLO “outperforms other methods when generalizing from natural images to other domains like artwork [..] YOLO demonstrates fewer false positives in background area”, as described by Hui on page 16. This known benefit is applicable to the embedded system object detection device because both share characteristics and capabilities; namely, both are directed to object detection. Therefore, it would have been recognized that modifying the object detection system using the YOLO algorithm to further include the feature of the bounding box corresponding to the grid cell size would have yielded predictable results because (i) the level of ordinary skill in the art demonstrated by the applied references shows the ability to apply the features of YOLO to the object detection system and (ii) the benefits of such a combination would have been recognized by those of ordinary skill in the art.
Therefore, it would have been obvious to combine the disclosures of Johnston, Fung, and Datahacker with the Hui disclosure to obtain the invention as specified in claim 16, as there is a reasonable expectation of success and/or doing so merely combines prior art elements according to known methods to yield predictable results.
Claim 1 recites a method with steps corresponding to the device elements recited in claim 16. Therefore, the recited steps of this claim are mapped to the proposed combination in the same manner as the corresponding elements of device claim 16. Additionally, the rationale and motivation to combine the Johnston, Fung, Datahacker, and Hui references, presented in rejection of claim 16 apply to this claim.
Consider claim 18 (similarly claim 3), the combination of Johnston, Fung, Datahacker, and Hui discloses “The method of claim 1, wherein the object detection system uses a neural network implementing one or more convolutional layers that, for each cell of the multiple cells, reduces a resolution of the region of one or more pixels (Hui page 27 paragraph 2 and page 28 paragraph 1 "YOLO adopts a different approach called passthrough. It reshapes the 26 × 26 × 512 layer to 13 × 13 ×2048. Then it concatenates with the original 13 × 13 ×1024 output layer. Now we apply convolution filters on the new 13 × 13 × 3072 layer to make predictions").”
The proposed combination, as well as the motivation for combining the Johnston, Fung, Datahacker, and Hui references presented in the rejection of claim 16, applies to claim 18. Finally, the device recited in claim 18 is met by Johnston, Fung, Datahacker, and Hui.
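The passthrough step Hui describes (reshaping the 26 × 26 × 512 layer to 13 × 13 × 2048 and concatenating with the 13 × 13 × 1024 layer to obtain 13 × 13 × 3072) is a space-to-depth rearrangement. The NumPy sketch below illustrates only the shape arithmetic of that description; it is not code from any cited reference.

```python
import numpy as np

def passthrough(fine, coarse, stride=2):
    """Space-to-depth passthrough: fold each stride x stride spatial
    block of the fine feature map into channels, then concatenate
    with the coarse feature map along the channel axis."""
    h, w, c = fine.shape
    # (26, 26, 512) -> (13, 2, 13, 2, 512)
    folded = fine.reshape(h // stride, stride, w // stride, stride, c)
    # -> (13, 13, 2, 2, 512) -> (13, 13, 2048): same element count,
    # spatial resolution halved, channels multiplied by stride**2.
    folded = folded.transpose(0, 2, 1, 3, 4).reshape(
        h // stride, w // stride, stride * stride * c)
    return np.concatenate([folded, coarse], axis=-1)

fine = np.zeros((26, 26, 512), dtype=np.float32)
coarse = np.zeros((13, 13, 1024), dtype=np.float32)
out = passthrough(fine, coarse)  # shape (13, 13, 3072)
```

Note that the element counts match exactly (26 × 26 × 512 = 13 × 13 × 2048), which is why the reshape loses no information.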
Consider claim 20 (similarly claim 5), the combination of Johnston, Fung, Datahacker, and Hui discloses “the embedded device of claim 16, wherein the processor is configured to assign a bounding box to a cell when detecting the centroid of the one of the multiple objects in the cell (Datahacker page 5 left hand figure and paragraph 3 "Subsequently, this analyzed image has two objects which are located in the remaining six grid cells. And what the YOLO algorithm does, it takes the midpoint of each of the two objects and then assigns the object to the grid cell that contains the midpoint. So, the left car is assigned to the green grid cell, whereas the car on the right is assigned to the orange grid cell"), wherein a size of the bounding box corresponds to a size of the cell (Hui Figure on page 6 shows both bounding boxes, yellow and blue, the yellow bounding box corresponds to the size of the cell).”
The proposed combination, as well as the motivation for combining the Johnston, Fung, Datahacker, and Hui references presented in the rejection of claim 16, applies to claim 20. Finally, the device recited in claim 20 is met by Johnston, Fung, Datahacker, and Hui.
Regarding claim 21 (similarly claim 6), the combination of Johnston, Fung, Datahacker, and Hui discloses “The embedded device of claim 16, wherein the processor is configured to: configure the object detection system to detect the background (Datahacker page 5 left hand figure and paragraph 2 "Let's start with the upper left grid cell, so for each grid cell we can define a vector […] Let's start with the upper left grid cell. For this grid cell, we [see] that there is no object present. So, the label vector y for the upper left grid cell will be Pc = 0") or the one of the multiple objects in each cell of the multiple cells independently (Datahacker page 5 left hand figure and paragraph 3 "Subsequently, this analyzed image has two objects which are located in the remaining six grid cells. And what the YOLO algorithm does, it takes the midpoint of each of the two objects and then assigns the object to the grid cell that contains the midpoint. So, the left car is assigned to the green grid cell, whereas the car on the right is assigned to the orange grid cell") and in parallel with one another (Datahacker page 7 paragraph 7 "This is a convolutional implementation because we’re not assessing this algorithm nine times on the 3x3 grid or 361 times if we are using the 9x9 grid. Instead, this is one single convolutional evaluation, and that’s why this algorithm is very efficient").”
The proposed combination, as well as the motivation for combining the Johnston, Fung, Datahacker, and Hui references presented in the rejection of claim 16, applies to claim 21. Finally, the device recited in claim 21 is met by Johnston, Fung, Datahacker, and Hui.
Regarding claim 24 (similarly claim 9), the combination of Johnston, Fung, Datahacker, and Hui teaches “The embedded device of claim 16, wherein the processor is configured to: configure the object detection system to determine a scale at which the image is divided into the multiple cells, wherein the scale is determined so that different objects of the multiple objects occupy different cells of the multiple cells (Datahacker page 7 paragraph 6 "Here we have used a relatively coarse 3x3 grid, in practice, we might use a much finer grid maybe 19x19. In that case we end up with 19x19x8 output. This step reduces the probability that we encounter multiple objects assigned to the same grid cell").”
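The effect Datahacker describes, that a finer grid (e.g., 19x19 rather than 3x3) reduces the probability of multiple objects being assigned to the same grid cell, can be illustrated with a small sketch. The helper function, image size, and centroid coordinates are illustrative assumptions, not values from any cited reference.

```python
def cell_index(cx, cy, image_w, image_h, s):
    """Grid-cell (row, col) containing centroid (cx, cy) on an s x s grid."""
    col = min(int(cx / image_w * s), s - 1)
    row = min(int(cy / image_h * s), s - 1)
    return row, col

# Two nearby object centroids in a hypothetical 300x300 image.
centroids = [(120, 150), (170, 150)]

# Coarse 3x3 grid: both centroids collide in a single cell.
coarse_cells = {cell_index(cx, cy, 300, 300, 3) for cx, cy in centroids}
# Finer 19x19 grid: each centroid falls in its own cell.
fine_cells = {cell_index(cx, cy, 300, 300, 19) for cx, cy in centroids}
```

On the coarse grid the two centroids map to one shared cell, while the finer grid separates them, which is the scale-selection rationale the claim mapping relies on.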
Claims 2, 8, 17, and 23 are rejected under 35 U.S.C. 103 as obvious over Johnston, Fung, Datahacker, and Hui, in view of Redmon et al. (“You Only Look Once: Unified Real-Time Object Detection” – Publication Year 2016).
Regarding claim 17 (similarly claim 2), the combination of Johnston, Fung, Datahacker, and Hui teaches the device of claim 16. The combination of Johnston, Fung, Datahacker, and Hui does not explicitly teach wherein the processor is configured to use a neural network trained by a loss function that gives a greater weight to detecting the one of the multiple objects and a lesser weight to detecting the background.
However, Redmon teaches “wherein the processor is configured to use a neural network trained (Redmon page 3 left hand column paragraph 3 "We then convert the model to perform detection. Ren et al. show that adding both convolutional and connected layers to pretrained networks can improve performance") by a loss function that gives a greater weight to detecting the one of the multiple objects and a lesser weight to detecting the background (Redmon page 3 right hand column paragraphs 1 and 2 "Also, in every image many grid cells do not contain any object. This pushes the “confidence” scores of those cells towards zero, often overpowering the gradient from cells that do contain objects. This can lead to model instability, causing training to diverge early on. To remedy this, we increase the loss from bounding box coordinate predictions and decrease the loss from confidence predictions for boxes that don’t contain objects").”
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention of the instant application to modify the system and method of object detection using an embedded device as taught by Johnston, Fung, Datahacker, and Hui to use a neural network trained by a loss function as taught by Redmon.
The suggestion/motivation for doing so would have been that “YOLO trains on full images and directly optimizes detection performance. This unified model has several benefits over traditional methods of object detection. First YOLO is extremely fast. Since we frame detection as a regression problem we don’t need a complex pipeline”, as disclosed by Redmon on page 1 right hand column paragraphs 3-4. Additionally, one of ordinary skill in the art would recognize that giving more weight to the target object and less weight to the background improves accuracy in object detection.
Therefore, it would have been obvious to combine the disclosure of Johnston, Fung, Datahacker, and Hui with the Redmon disclosure to obtain the invention as specified in claim 17 as there is a reasonable expectation of success and/or because doing so merely combines prior art elements according to known methods to yield predictable results.
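The remedy quoted from Redmon, down-weighting the confidence loss of cells that contain no object so background does not overpower the gradient from object cells, can be sketched as a simplified squared-error confidence loss. The weight value and prediction numbers are illustrative (Redmon's paper uses a no-object weight of 0.5, but this sketch is not the cited implementation, which also includes coordinate and class terms).

```python
def weighted_confidence_loss(pred_conf, has_object, lambda_noobj=0.5):
    """Simplified per-cell confidence loss in the style Redmon describes:
    squared error against a 0/1 objectness target, with cells that
    contain no object down-weighted by lambda_noobj."""
    loss = 0.0
    for p, obj in zip(pred_conf, has_object):
        target = 1.0 if obj else 0.0
        weight = 1.0 if obj else lambda_noobj
        loss += weight * (p - target) ** 2
    return loss

# One object cell predicted at 0.6; two background cells predicted at 0.4.
loss = weighted_confidence_loss([0.6, 0.4, 0.4], [True, False, False])
```

Without the down-weighting, the two background cells would contribute twice the error of the object cell; with lambda_noobj = 0.5 their combined contribution equals the object cell's, keeping the object gradient from being swamped.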
Regarding claim 23 (similarly claim 8), the combination of Johnston, Fung, Datahacker, Hui and Redmon teaches “The embedded device of claim 16, wherein the processor is configured to: implement the object detection system on the embedded device using less than 100 kB of memory (Johnston page 3 paragraph 3 "The Raspberry Pi 4 is fully capable of running 64-bit operating systems, unlike the Raspberry Pi 3 which only supported 1 GB of RAM for such OS’s. Because of this, the RPi4 is now open to a massive amount of 64-bit applications that it had no access to beforehand, including YOLOv5. I installed and tested YOLOv5 using a 16 GB microSD card running 64-bit Ubuntu, but since Linux is so general it may run fine on other operating systems");
receive the image as a first frame of multiple frames (Fung paragraph [0073] "At block 302, the method 300 includes receiving an image from an imaging system. For example, the processor 110 can receive an image (e.g., the image 202, the image 212, the image 220, and the image 226, a frame of a video) from the imaging system 106") with each frame having a resolution of at least 96 pixels by 96 pixels (Fung paragraph [0075] "As an illustrative example, with reference to FIG. 2D, the image 216 can be configured 416x416 pixel image that can be provided to the neural network 122 to complete YOLO object detection in a single pass"); and
detect the background or the one of the multiple objects in each cell of the multiple frames at a rate of at least 10 frames per second (Redmon page 1 right hand column paragraph 4 "Our base network runs at 45 frames per second with no batch processing on a Titan X GPU and a fast version runs at more than 150 fps. This means we can process streaming video in real-time with less than 25 milliseconds of latency").”
The proposed combination, as well as the motivation for combining the Johnston, Fung, Datahacker, Hui, and Redmon references presented in the rejections of claims 16 and 17, applies to claim 23. Finally, the device recited in claim 23 is met by Johnston, Fung, Datahacker, Hui, and Redmon.
Claims 4 and 19 are rejected under 35 U.S.C. 103 as obvious over Johnston, Fung, Datahacker, and Hui in view of Rahman et al. (“A Real-Time Wrong-Way Vehicle Detection Based on YOLO and Centroid Tracking” - Publication Year 2020).
Regarding claim 19 (similarly claim 4), the combination of Johnston, Fung, Datahacker, and Hui teaches “The embedded device of claim 16, wherein the processor is configured to use a neural network to detect, in each cell of the multiple cells (Fung paragraph [0077] “upon processing the grid cells based on dividing of the image into the SxS grid, the neural network processing unit 124 can be configured to compute and predict bounding boxes for each of the grids of the grid cells”), data that is the background (Datahacker page 5 left hand figure and paragraph 2 "Let's start with the upper left grid cell, so for each grid cell we can define a vector […] Let's start with the upper left grid cell. For this grid cell, we [see] that there is no object present. So, the label vector y for the upper left grid cell will be Pc = 0") or the one of the multiple objects (Datahacker page 5 left hand figure and paragraph 3 "Subsequently, this analyzed image has two objects which are located in the remaining six grid cells. And what the YOLO algorithm does, it takes the midpoint of each of the two objects and then assigns the object to the grid cell that contains the midpoint. So, the left car is assigned to the green grid cell, whereas the car on the right is assigned to the orange grid cell")”.
The combination of Johnston, Fung, Datahacker, and Hui does not explicitly teach “data that is either expected data or anomalous data”, where the expected data is “within a range determined when training the neural network, and wherein the data is anomalous data when detecting the background or the one of the multiple objects outside of the range”.
However, Rahman teaches the expected data is “within a range determined when training the neural network, and wherein the data is anomalous data when detecting the background or the one of the multiple objects outside of the range (Rahman page 3 right hand column paragraph 2 "On the right side of the image [Fig. 6], it is seen that a pedestrian is detected by the YOLO but is not tracked. Because, we feed only the vehicle classes to the centroid tracker and remove others")”.
[media_image5.png, greyscale, 270 x 351]
Rahman Figure 6
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention of the instant application to modify the system and method of object detection using an embedded device as taught by Johnston, Fung, Datahacker, and Hui to detect objects that are not of interest but discard them from further processing, as taught by Rahman.
The suggestion/motivation for doing so would have been that, “it was very difficult to use this technology due to the expensive computation and unavailability of data. But, the recent improvement of different machine learning and deep learning algorithm and availability of powerful Graphics Processing Unit (GPU) and the cheaper camera has made the whole system more efficient” as disclosed by Rahman page 1 right hand column paragraph 1.
Therefore, it would have been obvious to combine the disclosure of Johnston, Fung, Datahacker, and Hui with the Rahman disclosure to obtain the invention as specified in claim 19 as there is a reasonable expectation of success and/or because doing so merely combines prior art elements according to known methods to yield predictable results.
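For illustration only, the grid-cell assignment that Fung and Datahacker describe above can be sketched as follows. This is a hypothetical, minimal sketch, not code drawn from any cited reference; the grid size S, the function names, and the label dictionary fields (Pc, bx, by) are assumptions made solely to illustrate the cited teaching that each object is assigned to the cell containing its midpoint and that empty cells are labeled background with Pc = 0.

```python
# Hypothetical sketch of YOLO-style grid-cell labeling (not from the record):
# divide the image into an S x S grid, assign each object to the cell that
# contains its midpoint, and label all remaining cells as background (Pc = 0).

S = 3  # assumed grid dimension, as in Datahacker's 3x3 example figure

def cell_of_midpoint(x, y, img_w, img_h, s=S):
    """Return (row, col) of the grid cell containing midpoint (x, y)."""
    col = min(int(x / img_w * s), s - 1)
    row = min(int(y / img_h * s), s - 1)
    return row, col

def label_grid(objects, img_w, img_h, s=S):
    """Build per-cell labels: Pc = 1 where an object midpoint falls, else 0."""
    labels = [[{"Pc": 0} for _ in range(s)] for _ in range(s)]
    for (x, y) in objects:
        r, c = cell_of_midpoint(x, y, img_w, img_h, s)
        labels[r][c] = {"Pc": 1, "bx": x, "by": y}
    return labels

# Two objects (e.g., the two cars in Datahacker's figure) in a 300x300 image:
# one lands in the lower-left cell, the other in the lower-right cell, and the
# upper-left cell remains background, mirroring the cited passage.
grid = label_grid([(60, 200), (250, 210)], 300, 300)
```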
Claims 7 and 22 are rejected under 35 U.S.C. 103 as obvious over Johnston, Fung, Datahacker, and Hui in view of Kelcey ("Counting Bees" - Publication Year 2018).
Regarding claim 22 (similarly claim 7), the combination of Johnston, Fung, Datahacker, and Hui teaches the device of claim 16. The combination of Johnston, Fung, Datahacker, and Hui does not explicitly teach all of implementing “the object detection system on a microcontroller corresponding to the embedded device; and receiving the image from a camera connected to the microcontroller”.
However, Kelcey teaches implementing “the object detection system on a microcontroller corresponding to the embedded device; and
receiving the image from a camera connected to the microcontroller (Kelcey "a raspberry pi, a standard pi camera and a solar panel is a pretty simple").”
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention of the instant application to modify the system and method of object detection using an embedded device disclosed by Johnston, Fung, Datahacker, and Hui to include a camera connected to the microcontroller of the embedded device because such a modification would have been obvious to try. More specifically, the Raspberry Pi and camera are one of a predictable and ascertainable group of similar features. This group addresses the design need and/or other recognized problem of real-time object detection using a power-constrained device with a reasonable level of success. Therefore, it would have been obvious to try to modify the system and method of object detection using an embedded device disclosed by Johnston, Fung, Datahacker, and Hui to include a camera connected to the microcontroller of the embedded device since there are a finite number of identified, predictable potential solutions to the recognized need (as discussed above) and one of ordinary skill in the art could have pursued the known potential solutions with a reasonable expectation of success.
Claims 10 and 25 are rejected under 35 U.S.C. 103 as obvious over Johnston, Fung, Datahacker, Hui, and Redmon in view of Kwant et al. (US 2019/0073774).
Regarding claim 25 (similarly claim 10), the combination of Johnston, Fung, Datahacker, Hui, and Redmon teaches "The embedded device of claim 16, wherein the object detection system is implemented by a pipeline (Redmon page 2 left hand column paragraph 4 "This means our network reasons globally about the full image and all the objects in the image. The YOLO design enables end-to-end training and real time speeds while maintaining high average precision")".
The combination of Johnston, Fung, Datahacker, and Hui does not explicitly teach all of “including a signal processing component and a machine learning component.”
However, Kwant teaches “a pipeline including a signal processing component (Kwant paragraph [0120] "Alternatively or in addition, the processor 1503 may include one or more microprocessors configured in tandem via the bus 1501 to enable independent execution of instructions, pipelining, and multithreading. The processor 1503 may also be accompanied with one or more specialized components to perform certain processing functions and tasks such as one or more digital signal processors (DSP) 1507, or one or more application-specific integrated circuits (ASIC) 1509") and a machine learning component (Kwant paragraph [0049] "predicted by the computer vision system 103 by extracting pixel features from the image data corresponding to each cell, and then using machine learning ( e.g., a neural network) to predict the edge, line segment, and/or centroid of an object or portion of the object that may be depicted in the object").”
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention of the instant application to modify the system and method of object detection using an embedded device as taught by Johnston, Fung, Datahacker, Hui, and Redmon to implement a pipeline including a signal processing component and a machine learning component as taught by Kwant.
The suggestion/motivation for doing so would have been that "[a] DSP 1507 typically is configured to process real-world signals (e.g., sound) in real time independently of the processor 1503," as noted by Kwant, paragraph [0120]. Moreover, one of ordinary skill in the art would recognize that pipelining allows for a more scalable and efficient system for real-time decisions applicable to object detection.
Therefore, it would have been obvious to combine the disclosure of Johnston, Fung, Datahacker, Hui, and Redmon with the Kwant disclosure to obtain the invention as specified in claim 25 as there is a reasonable expectation of success and/or because doing so merely combines prior art elements according to known methods to yield predictable results.
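For illustration only, the pipeline structure Kwant describes (a signal-processing stage feeding a machine-learning stage) can be sketched as follows. This is a hypothetical, minimal sketch, not code drawn from any cited reference; the moving-average filter stands in for the DSP preprocessing Kwant's paragraph [0120] mentions, and the threshold classifier stands in for the neural-network component of paragraph [0049]. All function names and parameter values are assumptions made solely for illustration.

```python
# Hypothetical pipeline sketch (not from the record): a DSP-style
# preprocessing stage feeding a machine-learning classification stage.

def signal_processing(samples, window=3):
    """DSP stand-in: smooth raw samples with a simple moving average."""
    out = []
    for i in range(len(samples) - window + 1):
        out.append(sum(samples[i:i + window]) / window)
    return out

def machine_learning(features, threshold=0.5):
    """ML stand-in: classify each smoothed feature as object (1) or background (0)."""
    return [1 if f > threshold else 0 for f in features]

def pipeline(samples):
    """Chain the two components, as in a signal-processing + ML pipeline."""
    return machine_learning(signal_processing(samples))

# A toy signal: noise followed by a strong response and a trailing dip.
result = pipeline([0.0, 0.0, 0.9, 1.0, 1.0, 0.1])
```

The point of the sketch is only the staged structure: each component has a single responsibility, so the DSP stage can run independently of (and concurrently with) the classifier, which is the efficiency rationale quoted from Kwant above.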
References Cited
The prior art made of record and not relied upon is considered pertinent to applicant’s disclosure.
“Fast YOLO: A Fast You Only Look Once System for Real-Time Embedded Object Detection in Video” by Shafiee discloses a YOLO object detection system optimized for use on an embedded device.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JASPREET KAUR whose telephone number is (571)272-5534. The examiner can normally be reached Monday - Friday 9:30 am - 5:30 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Amandeep Saini can be reached at (571)272-3382. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/JASPREET KAUR/Examiner, Art Unit 2662
/AMANDEEP SAINI/Supervisory Patent Examiner, Art Unit 2662