DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claims 1-20 are pending.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-3, 9-11 and 17-19 are rejected under 35 U.S.C. 103 as being unpatentable over Divakaran et al. (US20140347475) in view of Laflamme (US20070171431).
Regarding claims 1, 9 and 17, Divakaran in view of Laflamme teaches a system comprising: at least one processor; and memory having instructions stored thereon that, when executed by the at least one processor, cause the system to execute a pipeline to:
obtain, in an automated roadway asset management software, image data of a physical scene along a section of a roadway,
(Divakaran, "recording two-dimensional (2D) or three-dimensional (3D) video images of portions of the real-world environment", [0024]; Laflamme, "automated extraction software", [0018]; "inventory of roadside infrastructure assets", [0017]; Divakaran teaches obtaining video/image data of a physical scene. Laflamme is incorporated to teach adapting this system specifically for automated extraction software used to manage roadway infrastructure assets)
The combination of Divakaran and Laflamme further teaches:
wherein the image data comprises still images or video frames acquired of a first section of a roadway physical scene at a first time instance and a second section of the roadway physical scene at a second time instance by a camera moving along the roadway mounted on a sensing vehicle instructed to acquire still images or video frames along the section of the roadway;
(Divakaran, "mobile cameras (such as cameras that are integrated with mobile electronic devices, such as smart phones, tablet computers, wearable electronic devices, aerial vehicles (AVs), unmanned aerial vehicles (UAVs), robots, and/or others)", [0024]; Laflamme, "One or more cameras 22 and at least one laser scanning device 24 are mounted on a road vehicle 20, such as a mini-van, in order to acquire data as the vehicle advances at traffic speed.", [0019]; Divakaran teaches the use of mobile cameras capturing environments. Laflamme specifically teaches mounting these cameras on a road vehicle moving along a roadway to acquire the data)
detect, in an automated roadway asset management software executing a trained machine learning model, a plurality of roadway objects in the image data, including a contiguous roadway object that appears in the first section of the roadway physical scene and in the second section of the roadway physical scene across a plurality of the still images or video frames,
(Divakaran, "Machine learning algorithms are often used for image recognition.", [0003]; "learning techniques to tracking data output by the tracking module 222 (e.g., the track stream 224) over time (e.g., training data).", [0041]; "similar appearance or motion is detected in a sequence of multiple frames of the video stream", [0059]; Laflamme, "guard-rail", [0007]; "objects that span along the road over certain parametric distances", [0026]; Divakaran teaches the execution of machine learning algorithms based on training data to detect and track objects across a plurality of frames. Laflamme is incorporated to apply this detection to contiguous roadway objects that span distances, such as guard rails)
wherein each of the plurality of roadway objects is detected from among a set of predefined roadway objects on which the trained machine learning model is trained to detect,
(Divakaran, "similarity of the detected feature to a training data set", [0030]; Laflamme, "a database of predefined road signs", [0031]; Divakaran teaches recognizing objects based on similarity to a predefined training data set. Laflamme confirms the use of a predefined database of objects to match and recognize detected structures)
wherein the trained machine learning model has been trained with labeled data for a retaining wall, a noise barrier, a rumble strip, an anchor, a guard rail;
(Laflamme, "guard-rail", [0007]; "For certain objects, such as guardrails, attachment point types need to be recognized.", [0028]; Laflamme explicitly teaches targeting and recognizing guard rails. Divakaran teaches utilizing training data, rendering the combination obvious for detecting guard rails)
determine, in an automated roadway asset management software, a position of each the plurality of detected roadway objects in the plurality of still images or video frames, including positions of the contiguous roadway object in the first section of the roadway physical scene and the second section of the roadway physical scene;
(Divakaran, "tracking module 222 analyzes temporal sequences of frames in the video stream 216 in order to track the movement of detected persons or objects over time.", [0039]; Laflamme, "locating the object", [0008]; "For objects that span along the road over certain parametric distances, extremity points are computed instead of a unique centroid.", [0026]; Divakaran teaches determining object positions across sequences of frames. Laflamme teaches computing the specific extremity positions of contiguous linear roadway objects spanning the scene)
track and join, in an automated roadway asset management software executing a computer-vision algorithm, a first determined position of each roadway object in a first one of the plurality of still images or video frames and a second determined position of the each roadway object in a second one of the plurality of still images or video frames to generate a multi-frame representation of each respective roadway object of the plurality of roadway objects;
(Divakaran, "Individual detections are associated together over time to form a track for a particular tracked object. The associations produce short segments of associated detections, which may also be called tracklets ... tracklets are associated with other tracklets across larger frame gaps to form longer tracklets.", [0064]; tracking and joining determined positions of objects across multiple frames to generate continuous representations (tracklets/tracks))
determine, in an automated roadway asset management software, a number of instances of the plurality of roadway objects over a predefined region of the roadway in the image data of the physical scene, including at least a number of unique instances of retaining walls, a number of unique instances of noise barriers, a number of unique instances of rumble strips, a number of unique instances of anchors, a number of unique instances of guard rails,
(Laflamme, "regrouping objects of interest of a same type, such as road sign, tree, pole, guardrail, etc", [0025]; "... a database of stored GIS layers. The layers can be stored, for example, in a relational database where a table corresponds to a layer or to a class of objects.", [0018]; regrouping identical types of items (such as guard rails) into classes within a database, which functionally determines the count/inventory of unique instances of that roadway object over the mapped region)
wherein joined roadway object are incremented as one to avoid double counting; and
(Divakaran, "JPDA can also produce fused track geo-locations in the overlapping zone so that only one set of geo-locations are reported globally.", [0029]; Laflamme, "aggregating proximal laser scanned points into a single object", [0008]; Divakaran teaches fusing overlapping tracks so they are reported globally as one entity, explicitly avoiding double counting. Laflamme confirms this by aggregating proximal points into single items)
output, via a user interface of the automated roadway asset management software, an indication of the determined number of instances of the plurality of roadway objects in the image data,
(Divakaran, "OT GUI 154 may be embodied as any suitable form of human-computer interface, such as a user interface for a workstation display", [0035]; Laflamme, "The database is connected to the automated extraction software and is populated in real time during the extraction process.", [0018]; Divakaran teaches outputting system tracking data via a graphical user interface. Laflamme teaches populating and exposing the database of extracted roadway objects)
wherein the determined number of instances is used for inventory management of assets at the roadway.
(Laflamme, "build the equipment inventory of their infrastructure network", [0003]; "inventory of roadside infrastructure assets", [0017]; using the system's extracted data to build and manage an equipment inventory for roadside infrastructure)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Laflamme into the object detection and tracking system of Divakaran in order to adapt the real-time object detection and tracking system to the automated inventory management of roadside infrastructure assets (such as guard rails), utilizing vehicle-mounted cameras moving along the roadway at traffic speeds to efficiently populate a geographic information system (GIS) database. The combination of Divakaran and Laflamme also teaches other enhanced capabilities.
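For illustration only, and not as a characterization of what either reference discloses, the "track and join" and "incremented as one" limitations mapped above can be sketched as follows. The names Detection, join_detections, and count_unique, the along-road position field, and the 5-metre gap threshold are all hypothetical assumptions introduced for this sketch. It uses a simple greedy proximity rule, not the tracklet association or JPDA fusion described in Divakaran.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Detection:
    frame: int        # index of the still image or video frame
    label: str        # predefined class, e.g. "guard_rail" (assumed naming)
    position: float   # hypothetical along-road position in metres


def join_detections(detections, max_gap_m=5.0):
    """Greedily join same-class detections whose positions lie within max_gap_m."""
    tracks = []  # each track is a list of detections treated as one object
    for det in sorted(detections, key=lambda d: (d.label, d.position)):
        last = tracks[-1][-1] if tracks else None
        if (last is not None and last.label == det.label
                and det.position - last.position <= max_gap_m):
            tracks[-1].append(det)   # same physical object seen in another frame
        else:
            tracks.append([det])     # a new unique instance starts here
    return tracks


def count_unique(tracks):
    """Count each joined track once, regardless of how many frames it spans."""
    return Counter(track[0].label for track in tracks)
```

Under these assumptions, a guard rail detected in three consecutive frames at positions 0, 4, and 8 metres forms a single track and contributes one to the guard rail count, while a guard rail detected at 50 metres is counted as a second instance.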
Regarding claims 2, 10 and 18, the combination of Divakaran and Laflamme teaches its/their respective base claim(s).
The combination further teaches the system of claim 1, wherein the tracking and joining the first determined position and the second determined position includes to:
detect a discontinuity in the multi-frame representation of the plurality of roadway objects between the first one of the plurality of still images or video frames and the second one of the plurality of still images or video frames; and
(Divakaran, “The illustrative scene awareness module 210 computes static and dynamic occlusion regions in the individual frames of the video stream 216 (which depicts the real-world environment in a field of view of a camera 112, 114, 116), and generates the static and dynamic occlusion maps 212, 214, accordingly”, “During static modeling, static occluders (such as poles and trees) detected in the scene are marked with static masks ... dynamic occluders (such as vehicles) are marked with dynamic masks ... The occlusion zones detected and mapped by the scene awareness module 210 are used in occlusion reasoning for both detection and tracking”, [0040]; Divakaran teaches occlusion detection and reasoning across video frames using multi-frame analysis modules (scene awareness, occlusion maps). The “static and dynamic occlusions” it handles are discontinuities in object appearance across sequenced frames as persons or objects move through the camera fields of view)
remove a still image or video frame associated with the discontinuity.
(Divakaran, “In the tracking context, the occlusion reasoning engine 316 is configured to determine whether to suspend or terminate tracking of a person or object. When a track is temporarily terminated by the tracking module 222, e.g., due to a large or persistent occlusion, the reacquisition module 610 is later executed to link a new track with the terminated track when the object or person reappears in a field of view of one of the cameras of the system 100”, [0058]; suspend/terminate tracking when occlusions (i.e., discontinuities) persist, including the ability to ignore frames with occluded/missing objects and only initiate or continue tracks if detection in non-occluded regions resumes, which in practical system flow means disregarding frames during discontinuity for tracking continuity)
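For illustration only, the detect-and-remove steps mapped above can be sketched as a filter over a per-object track. The function name remove_discontinuous_frames, the (frame index, position) layout, and the max_jump threshold are hypothetical assumptions. The sketch is a simplified stand-in and does not reproduce the occlusion reasoning engine or reacquisition module of Divakaran.

```python
def remove_discontinuous_frames(track, max_jump=10.0):
    """Split a track into kept observations and removed frame indices.

    track: list of (frame_index, position) pairs, where position is None
    when the object was not detected in that frame (e.g. occluded).
    """
    kept, removed = [], []
    for frame, pos in track:
        if pos is None:
            removed.append(frame)        # object missing: treat as a discontinuity
        elif kept and abs(pos - kept[-1][1]) > max_jump:
            removed.append(frame)        # implausible jump from the last kept frame
        else:
            kept.append((frame, pos))    # consistent with the track so far
    return kept, removed
```

Under these assumptions, a frame in which the object is occluded and a frame whose position jumps by more than max_jump are both dropped, and the remaining frames form the continuous multi-frame representation.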
Regarding claims 3, 11 and 19, the combination of Divakaran and Laflamme teaches its/their respective base claim(s).
The combination further teaches the system of claim 1, wherein the instructions further cause the system to:
determine a height value of each the plurality of roadway objects from the image data; and
output, via the user interface, an indication of the determined height value, wherein the determined height value is used for the inventory management of assets at the roadway.
(Laflamme, Fig. 3; “The coordinates X, Y, Z are computed for the points of interest using angle, range, and global positioning information. Points are then filtered once again, this time based on their height from the ground. The filtered points are re-grouped using a proximity filter to identify points that belong to a same object. The objects that are identified are estimated for size and a centroid (unique object location) is computed”, [0030]; “Object measurement is done by computing the extent (or bounding box) of the aggregated set of points for each object. Using laser orientation, frequency, scans per second, and/or traveling speed, a threshold is used to best approximate object size”, [0027]; determining the height value of roadway-side objects. Laser and image data are combined to determine 3D (X,Y,Z) coordinates, with Z corresponding to height from ground for each detected asset)
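For illustration only, the height determination mapped above can be sketched from an object's aggregated (X, Y, Z) points. The function names bounding_box and object_height and the ground_z parameter are hypothetical assumptions. The sketch takes the height as the top of the axis-aligned bounding box measured from an assumed ground elevation and omits the filtering and thresholding steps described in Laflamme.

```python
def bounding_box(points):
    """Return the (min_corner, max_corner) extent of a set of (x, y, z) points."""
    xs, ys, zs = zip(*points)
    return (min(xs), min(ys), min(zs)), (max(xs), max(ys), max(zs))


def object_height(points, ground_z=0.0):
    """Height of the object's highest point above an assumed ground elevation."""
    _, (_, _, z_max) = bounding_box(points)
    return z_max - ground_z
```

Under these assumptions, three points with Z values of 0.1, 0.8, and 0.75 metres yield an object height of 0.8 metres when the ground elevation is taken as zero.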
Allowable Subject Matter
Claims 4-8, 12-16 and 20 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
The following is a statement of reasons for the indication of allowable subject matter:
Claims 4, 12 and 20 recite limitations directed to a CNN backbone with an FPN that extracts multi-scale features, GRoIE with global-context attention, and heads that output bounding boxes, classifications, and per-pixel segmentation masks. No explicit teaching of these limitations was found in the prior art cited in this Office action or in the prior art search.
Claims 5-8 and 13-16 depend on claims 4 and 12, respectively.
Response to Arguments
Applicant's arguments filed on 3/5/2026 with respect to one or more of the pending claims have been fully considered but they are not persuasive.
Regarding claim(s) 1, 9 and 17, Applicant, in the remarks, argues that the combination of the cited reference(s) fails to teach the newly amended limitations in the claims.
The Examiner respectfully disagrees. The Office action has been updated to address Applicant’s arguments. See the updated rejections above for details.
Conclusion
THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JIANXUN YANG whose telephone number is (571)272-9874. The examiner can normally be reached on MON-FRI: 8AM-5PM Pacific Time.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Amandeep Saini, can be reached at (571)272-3382. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/JIANXUN YANG/
Primary Examiner, Art Unit 2662
5/2/2026