DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Information Disclosure Statement
The information disclosure statement (IDS) submitted on 09/05/2024 is being considered by the examiner.
Specification
35 U.S.C. 112(a) or pre-AIA 35 U.S.C. 112, first paragraph, requires the specification to be written in “full, clear, concise, and exact terms.” The specification is replete with terms which are not clear, concise, and exact, and it should be revised carefully in order to comply with 35 U.S.C. 112(a) or pre-AIA 35 U.S.C. 112. Examples of unclear, inexact, or verbose usage in the specification include the following. Terms that are cited a few times in the Detailed Description are not cited consistently throughout the remaining paragraphs of the Detailed Description. For example, paragraph 0048 discloses and labels “object detection data” with reference number 327; further on in the disclosure, “object detection data” is mentioned multiple times with labels 327 and 407; and from paragraph 0056 onward, “object detection data” appears without any proper label. Many other terms throughout the detailed description are likewise referenced in the drawings but lack a proper reference label when mentioned in the text. Furthermore, reference numbering is inconsistent for terms that were previously labeled; for example, “one or more processors” is referenced by two different numbers, 304 and 204. Many other terms throughout the detailed description have similarly inconsistent reference numbering. Further revision is required, and additional typos should be corrected accordingly.
The disclosure is objected to because of the following informalities:
In the detailed description, “one or more processors” is referenced by numbers 304 and 204.
In the detailed description, “one or more memory components” is referenced by numbers 302 and 202.
In the detailed description, “server network interface hardware” is referenced by numbers 306 and 206.
In the detailed description, “data storage component” is referenced by numbers 307 and 207.
In the detailed description, “a mixed reality environment 403” and “simulated-3D view 403” are referenced by the same number 403.
Appropriate correction is required.
Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
Claim(s) 1, 2, 9, 10, 12, 13, 14, 15, and 19 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Gruteser (US-20210110191-A1, hereinafter “Gruteser”).
Regarding claim 1, Gruteser teaches “A system for reducing latency and bandwidth usage in reality devices comprising:” (System; Para 0046); (save bandwidth and thereby reduce latency; Para 0046); (AR device; Para 0046);
“a reality device comprising a camera to operably capture a frame of a view external to a vehicle; and” (a frame captured by the camera from the AR device; Para 0056); (detecting surrounding vehicles; Para 0032);
“one or more processors operable to:” (one processor being dedicated to the functions of the AR device; Para 0071);
“send the frame to an edge server;” (frame captures performed by the AR device 12, as well as the transmission to and processing of such frames by the edge cloud device 14; Para 0048); (Fig. 1 shows frame n captured by the AR device being received by the edge cloud);
“receive object detection data from the edge server, wherein the object detection data comprises object information in the frame; and” (object detection inference on the frame; Para 0056); (Fig. 1 shows that after frame n is received, inference of frame n is finished and the detection results are received);
“instruct the reality device to render a mixed reality environment with the object detection data.” (the system requires only 2.24 ms latency and less than 15% resources on the AR device, which leaves the remaining time between frames to render high quality virtual elements for high quality AR/MR experience; Para 0047); (object detection; Para 0056); (the AR device to render high-quality virtual overlays; Para 0046);
Regarding claim 2, Gruteser teaches “The system of claim 1, wherein a size of the object detection data is smaller than a size of the frame.” (reduces the size of encoded frames with only a small tradeoff in object detection accuracy; Para 0070);
(regions of interest which are potential objects locations in the frame; Para 0073);
(Fig. 4A-4C showcase objects and object detection data that is smaller than the frame of the image);
Regarding claim 9, Gruteser teaches “The system of claim 1, wherein the one or more processors are further operable to superimpose the object detection data onto a real-world view.” (the AR device, which leaves the remaining time between frames to render high quality virtual elements for high quality AR/MR experience; Para 0047);
(Fig. 4A-4C showcase the object detection data of the objects in the real-world view.)
Regarding claim 10, Gruteser teaches “The system of claim 9, wherein the object detection data are superimposed onto a vision of a user or a current frame.” (the AR device to render high-quality virtual overlays; Para 0046);
(the RoI can be derived as the area that a user chooses to look at; Para 0072);
(Fig. 4A-4C showcase the object detection data of the objects in the current frame.)
Regarding claim 12, Gruteser teaches “A server for reducing latency and bandwidth usage in reality devices comprising:” (the server; Para 0094); (save bandwidth and thereby reduce latency; Para 0046); (AR device; Para 0046);
“one or more processors operable to:” (one processor being dedicated to the functions of the AR device; Para 0071);
“receive a frame of a view external to a vehicle from a reality device comprising a camera to operably capture the frame;” (a frame captured by the camera from the AR device; Para 0056);
(frame captures performed by the AR device 12, as well as the transmission to and processing of such frames by the edge cloud device 14; Para 0048); (Fig. 1 shows frame n, captured by the camera of the AR device, being received by the edge cloud); (detecting surrounding vehicles; Para 0032);
“generate object detection data, wherein the object detection data comprises information about objects in the frame; and” (object detection inference on the frame; Para 0056); (Fig. 1 shows that after frame n is received, inference of frame n is finished and the detection results are received);
“send the object detection data to the reality device to render a mixed reality environment with the object detection data by the reality device.” (Fig. 1 shows T3 sending the detection results to T4, then T5, to be rendered in the MR environment.);
(the system requires only 2.24 ms latency and less than 15% resources on the AR device, which leaves the remaining time between frames to render high quality virtual elements for high quality AR/MR experience; Para 0047); (the AR device to render high-quality virtual overlays; Para 0046);
Regarding claim 13, Gruteser teaches “The server of claim 12, wherein the server sends the object detection data without sending the frame to the reality device.” (the detection result is sent back to the AR device 12; Para 0059);
Regarding claim 14, Gruteser teaches “A method for reducing latency and bandwidth usage in reality devices comprising:” (methods; Para 0025); (save bandwidth and thereby reduce latency; Para 0046); (AR device; Para 0046);
“sending a frame of a view external to a vehicle to an edge server, wherein the frame is captured by a reality device;” (frame captures performed by the AR device 12, as well as the transmission to and processing of such frames by the edge cloud device 14; Para 0048); (Fig. 1 shows frame n captured by the AR device being received by the edge cloud); (detecting surrounding vehicles; Para 0032);
“receiving object detection data from the edge server, wherein the object detection data comprises object information in the frame; and” (object detection inference on the frame; Para 0056); (Fig. 1 shows that after frame n is received, inference of frame n is finished and the detection results are received);
“instructing the reality device to render a mixed reality environment with the object detection data.” (the system requires only 2.24 ms latency and less than 15% resources on the AR device, which leaves the remaining time between frames to render high quality virtual elements for high quality AR/MR experience; Para 0047); (the AR device to render high-quality virtual overlays; Para 0046);
Regarding claim 15, Gruteser teaches “The method of claim 14, wherein a size of the object detection data is smaller than a size of the frame.” (reduces the size of encoded frames with only a small tradeoff in object detection accuracy; Para 0070);
(regions of interest which are potential objects locations in the frame; Para 0073);
(Fig. 4A-4C showcase objects and object detection data that is smaller than the frame of the image);
Regarding claim 19, Gruteser teaches “The method of claim 14, wherein the method further comprises superimposing the object detection data onto a real-world view.” (the AR device, which leaves the remaining time between frames to render high quality virtual elements for high quality AR/MR experience; Para 0047);
(Fig. 4A-4C showcase the object detection data of the objects in the real-world view.)
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claim(s) 3, 4, 5, 7, 8, 11, 17, 18, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Gruteser in view of Tran (US-10816993-B1, hereinafter “Tran”).
Regarding claim 3, while Gruteser fails to teach all of claim 3, Tran teaches “The system of claim 1, wherein the object detection data comprises box cords of detected objects in the frame, confidence of each corresponding box cord, and object information of each detected object.” (bounding box involving the x, y coordinate and the width and height and the confidence; Col 31, Line 25-26);
(As noted above, cameras can still be used to detect short range objects/symbols useful for navigation. For example, objects can include pavement markings which are used to convey messages to roadway users and to the camera and vision system. They indicate which part of the road to use, provide information; Col. 31, Line 33-38);
Tran discloses a bounding box that has coordinates, confidence, and other information of the detected object that the bounding box surrounds. This relates to the claimed box cords of detected objects.
Gruteser and Tran are analogous art as both of them are related to object detection.
The motivation for the above is to have more accurate data about the detected object in the frame.
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Gruteser such that the object detection data comprises box cords of detected objects in the frame, confidence of each corresponding box cord, and object information of each detected object, as taught by Tran.
Regarding claim 4, while Gruteser fails to teach all of claim 4, Tran teaches “The system of claim 3, wherein each box cord comprises coordinates of three or more vertices of the corresponding detected object.” (at least three points and likewise identifies the corresponding position vectors in the map; Col 17, Line 19-21);
((at least 3 non-collinear points) on the sign then the HD map has enough data and can continue; Col 23, Line 44-46);
Tran discloses three points or position vectors that correspond to the detected object's box coordinates.
The motivation for the above is to have more accurate coordinate vertices that surround the detected object.
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Gruteser such that each box cord comprises coordinates of three or more vertices of the corresponding detected object, as taught by Tran.
Regarding claim 5, while Gruteser fails to teach all of claim 5, Tran teaches “The system of claim 3, wherein the confidence is between 0 and 1.” (probabilities, a number between 0 and 1);
Tran discloses the probabilities being between 0 and 1, which correlates to the confidence of the bounding box.
The motivation for the above is to have an accurate confidence value when detecting the object.
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Gruteser such that the confidence is between 0 and 1, as taught by Tran.
Regarding claim 7, while Gruteser fails to teach all of claim 7, Tran teaches “The system of claim 3, wherein the object information comprises a class of the detected object.” (Object detection combines these two tasks and draws a bounding box around each object of interest in the sensor output and assigns them a class label; Col 29, Line 31-33);
(determining a classification and a state of the detected object; Col 46, Line 1-2);
The motivation for the above is to have accurate classification of the detected objects for better usability.
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Gruteser such that the object information comprises a class of the detected object, as taught by Tran.
Regarding claim 8, while Gruteser fails to teach all of claim 8, Tran teaches “The system of claim 3, wherein the object information of each detected object is associated with the corresponding box cord as an annotation.” (The 3D points that project within the image bounding box created by the sign's vertices are considered sign points. These 3D points are used to fit a plane, wherein the HD map projects the sign's image vertices onto that 3D plane to find the 3D coordinates of the sign's vertices. At which point the HD map has all of the information to describe a sign: its location in 3D space, its orientation described by its normal and the type of sign produced from classifying the sign in the image; Col 19, Line 56-64);
Tran discloses sign points that are created from the bounding box's vertices. This relates to the box cords as an annotation, since the sign points are positioned near the vertices of the bounding box of the detected object.
The motivation for the above is to have accurate annotation of the detected object's coordinates for better usability.
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Gruteser such that the object information of each detected object is associated with the corresponding box cord as an annotation, as taught by Tran.
Regarding claim 11, while Gruteser fails to teach all of claim 11, Tran teaches “The system of claim 1, wherein the one or more processors are further operable to autonomously drive the vehicle based on the object detection data.” (The HD map stores objects or data structures representing lane elements that comprise information representing geometric boundaries of the lanes; driving direction along the lane; vehicle restriction for driving in the lane, for example, speed limit, relationships with connecting lanes including incoming and outgoing lanes; a termination restriction, for example, whether the lane ends at a stop line, a yield sign, or a speed bump; and relationships with road features that are relevant for autonomous driving, for example, traffic light locations, road sign locations and so on; Col 19, Line 1-11);
Tran discloses autonomous driving based on detection data derived from lane and road information; this relates to autonomously driving the vehicle as claimed.
The motivation for the above is to have an accurate vehicle driving system and to have reliable data to have an autonomous driving vehicle.
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Gruteser with operations to autonomously drive the vehicle based on the object detection data, as taught by Tran.
Regarding claim 17, while Gruteser fails to teach all of claim 17, Tran teaches “The method of claim 16, wherein:
each box cord comprises coordinates of three or more vertices of the corresponding detected object; and” (at least three points and likewise identifies the corresponding position vectors in the map; Col 17, Line 19-21);
((at least 3 non – collinear points) on the sign then the HD map has enough data and can continue; Col 23, Line 44-46);
“the confidence is between 0 and 1.” (probabilities, a number between 0 and 1);
The motivation for the above is to have more accurate coordinate vertices that surround the detected object and an accurate confidence value when detecting the object.
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Gruteser such that each box cord comprises coordinates of three or more vertices of the corresponding detected object and the confidence is between 0 and 1, as taught by Tran.
Regarding claim 18, while Gruteser fails to teach all of claim 18, Tran teaches “The method of claim 16, wherein:
the object information comprises a class of the detected object; and” (Object detection combines these two tasks and draws a bounding box around each object of interest in the sensor output and assigns them a class label; Col 29, Line 31-33);
(determining a classification and a state of the detected object; Col 46, Line 1-2);
“the object information of each detected object is associated with the corresponding box cord as an annotation.” (The 3D points that project within the image bounding box created by the sign's vertices are considered sign points. These 3D points are used to fit a plane, wherein the HD map projects the sign's image vertices onto that 3D plane to find the 3D coordinates of the sign's vertices. At which point the HD map has all of the information to describe a sign: its location in 3D space, its orientation described by its normal and the type of sign produced from classifying the sign in the image; Col 19, Line 56-64);
The motivation for the above is to have accurate classification of the detected objects and accurate annotation of the detected objects' coordinates for better usability.
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Gruteser such that the object information comprises a class of the detected object and the object information of each detected object is associated with the corresponding box cord as an annotation, as taught by Tran.
Regarding claim 20, while Gruteser fails to teach all of claim 20, Tran teaches “The method of claim 14, wherein the method further comprises autonomously driving the vehicle based on the object detection data.” (The HD map stores objects or data structures representing lane elements that comprise information representing geometric boundaries of the lanes; driving direction along the lane; vehicle restriction for driving in the lane, for example, speed limit, relationships with connecting lanes including incoming and outgoing lanes; a termination restriction, for example, whether the lane ends at a stop line, a yield sign, or a speed bump; and relationships with road features that are relevant for autonomous driving, for example, traffic light locations, road sign locations and so on; Col 19, Line 1-11);
The motivation for the above is to have an accurate vehicle driving system and to have reliable data to have an autonomous driving vehicle.
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Gruteser to further comprise autonomously driving the vehicle based on the object detection data, as taught by Tran.
Claim(s) 6 and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Gruteser in view of Tran and in further view of Xu (US-10997433-B2, hereinafter “Xu”).
Regarding claim 6, while Gruteser and Tran fail to teach all of claim 6, Xu teaches “The system of claim 3, wherein the object detection data further comprise a cropped image path.” (The ROI images may represent a cropped image (e.g., center crop, right crop, left crop, half size, etc.). The cropped image may include a portion of a polygon (e.g., a polygon from the annotations of the original image) representing a lane or boundary outside; Col 18, Line 25-29);
Xu discloses a cropped image that comprises a crop and a polygon of a bounding box; this relates to the cropped image path of the claimed subject matter.
Gruteser, Tran and Xu are analogous art as they are all related to object detection.
The motivation for the above is to have a user-friendly informational image of the detected object that the user can refer back to.
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Gruteser and Tran such that the object detection data further comprise a cropped image path, as taught by Xu.
Regarding claim 16, while Gruteser fails to teach all of claim 16, Tran teaches “The method of claim 14, wherein the object detection data comprises box cords of detected objects in the frame, confidence of each corresponding box cord, object information of each detected object,” (bounding box involving the x, y coordinate and the width and height and the confidence; Col 31, Line 25-26);
(As noted above, cameras can still be used to detect short range objects/symbols useful for navigation. For example, objects can include pavement markings which are used to convey messages to roadway users and to the camera and vision system. They indicate which part of the road to use, provide information; Col. 31, Line 33-38);
However, Tran fails to teach “a cropped image path”.
Xu teaches “a cropped image path.” (The ROI images may represent a cropped image (e.g., center crop, right crop, left crop, half size, etc.). The cropped image may include a portion of a polygon (e.g., a polygon from the annotations of the original image) representing a lane or boundary outside; Col 18, Line 25-29);
The motivation for the above is to have more accurate data about the detected object in the frame and a user-friendly informational image of the detected object that the user can refer back to.
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Gruteser such that the object detection data comprises box cords of detected objects in the frame, confidence of each corresponding box cord, and object information of each detected object, as taught by Tran, and to have further modified Gruteser and Tran to include a cropped image path, as taught by Xu.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
US-9659380-B1 (Castellani) – Discloses tracking positions of object(s) in video frames, including: processing an initial frame of a set of frames of the video frames using feature extraction to identify locations of features of the object(s), obtaining a next frame of the set and applying a motion estimation algorithm as between the next frame and a prior frame to identify updated locations of the features in the next frame.
US-11682272-B2 (Avadhanam) – Discloses a vehicle or device for pedestrian crossing warning system that may use multi-modal technology to determine attributes of a person and provide a warning to the person in response to a calculated risk level to affect a reduction of the risk level. Can utilize sensors to receive data indicative of a trajectory of a person external to the vehicle.
US-9741169-B1 (Holz) – Discloses capabilities to view and/or interact with the real world to the user of a wearable (or portable) device allowing for a smooth transition between an immersive virtual environment and a convergent physical real environment during an augmented hybrid experience.
WO-2020205597-A1 (MOUSTAFA) – Discloses a vehicle configured to sense information about the environment. The vehicle may use the sensed information to navigate through the environment.
WO-2022023789-A1 (MCLACHLAN) – Discloses an electronic device in an edge cloud of a mobile network that supports extended reality overlay placement for an object having a location in the real world.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to BRIGITER D PROTAZI whose telephone number is (571)272-7995. The examiner can normally be reached Monday - Friday 7:30-5.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Said A. Broome, can be reached at 571-272-2931. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/B.D.P./Examiner, Art Unit 2612
/Said Broome/Supervisory Patent Examiner, Art Unit 2612