Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
DETAILED ACTION
This action is responsive to the patent application filed on 5/13/2024, which claims foreign priority to Japanese Patent Application No. 2023-113115, filed 07/10/2023.
This action is made Non-Final.
Claims 1 – 5 are pending in the case. Claims 1 and 5 are independent claims.
Information Disclosure Statement
The information disclosure statement (IDS) submitted on 5/13/2024 is in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statement is being considered by the examiner.
Drawings
The drawings filed on 5/13/2024 have been accepted by the Examiner.
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
Claims 1, 3, and 4 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Shambik (US 2022/0027642 A1).
Claim 1:
Shambik discloses A vehicle position estimation system comprising: a camera; and processing circuitry configured to estimate a position of a target vehicle shown in an image captured by the camera (0006: a navigation system for a host vehicle may include at least one processor. The processor may be programmed to receive, from a camera of the host vehicle, at least one captured image representative of an environment of the host vehicle; and analyze one or more pixels of the at least one captured image to determine whether the one or more pixels represent a target vehicle, wherein at least a portion of the target vehicle is not represented in the at least one captured image. The processor may further be configured to determine an estimated distance from the host vehicle to the target vehicle, wherein the estimated distance is based at least in part on the portion of the target vehicle not represented in the at least one captured image), wherein the processing circuitry is configured to execute: extracting a universal feature point and a plurality of types of unique feature points from the captured image using a trained model generated in advance by machine learning, the universal feature point being a feature point independent of vehicle type, each of the plurality of types of unique feature points being a feature point corresponding to each of a plurality of applicable vehicle types (Fig 19, 0290-294: Process 1900 may include receiving a plurality of images acquired as one or more vehicles traverse the road segment (step 1905). Server 1230 may receive images from cameras included within one or more of vehicles 1205, 1210, 1215, 1220, and 1225. For example, camera 122 may capture one or more images of the environment surrounding vehicle 1205 as vehicle 1205 travels along road segment 1200... identifying, based on the plurality of images, at least one line representation of a road surface feature extending along the road segment (step 1910). Each line representation may represent a path along the road segment substantially corresponding with the road surface feature. For example, server 1230 may analyze the environmental images received from camera 122 to identify a road edge or a lane marking and determine a trajectory of travel along road segment 1200 associated with the road edge or lane marking. In some embodiments, the trajectory (or line representation) may include a spline, a polynomial representation, or a curve... identifying, based on the plurality of images, a plurality of landmarks associated with the road segment (step 1910). For example, server 1230 may analyze the environmental images received from camera 122 to identify one or more landmarks, such as road sign along road segment 1200. Server 1230 may identify the landmarks using analysis of the plurality of images acquired as one or more vehicles traverse the road segment... the navigation information may include a target trajectory for vehicles to travel along a road segment, and process 1900 may include clustering, by server 1230, vehicle trajectories related to multiple vehicles travelling on the road segment and determining the target trajectory based on the clustered vehicle trajectories, as discussed in further detail below.
Clustering vehicle trajectories may include clustering, by server 1230, the multiple trajectories related to the vehicles travelling on the road segment into a plurality of clusters based on at least one of the absolute heading of vehicles or lane assignment of the vehicles); acquiring information on the vehicle type of the target vehicle; selecting a target unique feature point from the plurality of types of unique feature points according to the vehicle type of the target vehicle; and estimating the position of the target vehicle based on image coordinates of the universal feature point and the target unique feature point (0363-369: The resulting trained model may be used to analyze information associated with a pixel or cluster of pixels to identify whether it represents a target vehicle, whether it is on a face or edge of the vehicle, distances to an edge of a face, or other information... Based on the mapping of each pixel to boundaries or faces target vehicle 2710, the navigation system may more accurately determine a boundary of vehicle 2710, which may allow the host vehicle to accurately determine one or more appropriate navigation actions... The disclosed techniques may be particularly beneficial in situations where target vehicles occupy much if not all of a captured image (e.g., where at least one edge of the vehicle is not present to suggest the orientation of the bounding box), where reflections of a target vehicle are included in an image, where a vehicle is being carried on a trailer or carrier, etc... a trained neural network model may be used to determine that pixel 2824 includes an edge of vehicle 2810. The trained model may also determine that pixel 2822 is on a face of vehicle 2810, similar to pixel 2722. The system may also determine one or more distances from pixel 2822 to an edge of the face of the vehicle, similar to distances 2732 and 2734, described above. In some embodiments, this estimated distance information may be used to define a boundary of the vehicle that is not included in the image. For example, pixel 2822 may be analyzed to determine an estimated distance from pixel 2822 to an edge of vehicle 2810 that is beyond the edge of the image frame. Accordingly, based on a combined analysis of the pixels associated with vehicle 2810 that do appear in image, an accurate boundary for vehicle 2810 may be determined. This information may be used for determining a navigation action for the host vehicle. For example, an unseen rear edge of a vehicle (e.g., a truck, bus, trailer, etc.) may be determined based on a portion of the vehicle that is within the image frame. Accordingly, the navigation system may estimate what clearance is required for the target vehicle to determine whether the host vehicle needs to brake or slow down, move into an adjacent lane, speed up, etc... the disclosed techniques may be used to determine bounding boxes that represent vehicles that are not target vehicles and thus should not be associated with a bounding box for purposes of navigation determinations. In some embodiments, this may include vehicles that are being towed or carried by other vehicles. FIG. 29 is an illustration of an example image 2900 showing vehicles on a carrier, consistent with the disclosed embodiments. Image 2900 may be captured by a camera of a host vehicle, such as image capture device 120 of vehicle 200, as described above. In the example shown in FIG. 29, image 2900 may be a side-view image taken by a camera positioned on a side of vehicle 200.
Image 2900 may include a carrier vehicle 2910 which may be carrying one or more vehicles 2920 and 2930. While carrier vehicle 2910 is shown as an automobile transport trailer, various other vehicle carriers may be identified. For example, carrier vehicle 2910 may include a flatbed trailer, a tow truck, a single-car trailer, a tilt car carrier, a goose-neck trailer, a drop deck trailer, a wedge trailer, an automobile shipment train car, or any other vehicle for transporting another vehicle... the neural network may be trained such that a pixel within vehicles 2920 or 2930 is associated with an edge of vehicle 2910 rather than an edge of vehicles 2920 or 2930. Thus, no boundary may be determined for the carried vehicles. Vehicles 2920 and 2930 may also be identified as vehicles being carried using other techniques, for example, based on the position of the vehicle relative to vehicle 2910, an orientation of the vehicle, the position of the vehicle relative to other elements in the image, etc).
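For context on the claim-1 sequence mapped above (extracting a universal feature point and type-specific unique feature points, selecting the unique feature points according to the target vehicle's type, and estimating position from their image coordinates), the following minimal Python sketch restates that selection step in code form. All names, dictionary keys, and coordinate values are invented for illustration and are not drawn from Shambik or from the present application.

```python
# Illustrative only: keys, names, and coordinates below are invented and are
# not drawn from Shambik (US 2022/0027642 A1) or the present application.
from typing import Dict, List, Tuple

Point = Tuple[float, float]  # (u, v) image coordinates of a detected feature point


def select_feature_points(model_output: Dict[str, List[Point]],
                          target_vehicle_type: str) -> List[Point]:
    """Combine the type-independent (universal) points with the unique points
    selected according to the target vehicle's type, per the claim-1 sequence."""
    universal = model_output["universal"]                 # universal feature points
    unique = model_output.get(target_vehicle_type, [])    # selection by vehicle type
    return universal + unique


# Placeholder output of a trained feature-point model for one captured image.
model_output = {
    "universal": [(0.48, 0.52), (0.55, 0.60)],
    "truck": [(0.30, 0.40), (0.72, 0.44)],
    "sedan": [(0.33, 0.47)],
}
points = select_feature_points(model_output, "truck")
# `points` would then be handed to a geometric back-end (not shown) that
# estimates the target vehicle's position from these image coordinates.
```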
Claim 3:
Shambik discloses each of the plurality of types of unique feature points includes a plurality of types of attitude-specific feature points classified according to vehicle attitude (0379: process 3100 may include determining an orientation of the at least a portion of the boundary generated relative to the target vehicle. As described above, the orientation may be determined based on the combined analysis of each of the pixels associated with the target vehicle, including one or more identified faces associated with the target vehicle. The determined orientation may be indicative of an action (or future action or state) that is being performed or that will be performed by the target vehicle. For example, the determined orientation may be indicative of a lateral movement relative to the host vehicle (e.g., a lane change maneuver) by the target vehicle or a maneuver by the target vehicle toward a path of the host vehicle. Process 3100 may further include determining a navigational action for the host vehicle based on the determined orientation of the at least a portion of the boundary and cause the vehicle to implement the determined navigational action), the processing circuitry is further configured to execute acquiring information on the vehicle attitude of the target vehicle, and the selecting the target unique feature point includes: selecting a unique feature point corresponding to the vehicle type of the target vehicle from the plurality of types of unique feature points; and selecting, as the target unique feature point, an attitude-specific feature point corresponding to the vehicle attitude of the target vehicle from the plurality of types of attitude-specific feature points included in the selected unique feature point (0361, 0365, 0376: For example, pixel 2722 may correspond to a particular portion of a bumper recognized by the system. Other pixels, such as pixels representing a license plate, tail light, tire, exhaust pipe, or the like may be associated with different estimated distances. Although not shown in FIG. 27, various other distances may be estimated, including a distance to a top edge of face 2730, distances to forward or rearward edges (depending on the orientation of vehicle 2710), distances to edge 2712 of vehicle 2710, or the like... the system may determine a boundary with an orientation that more accurately represents the orientation of the target vehicle. For example, based on a combined analysis of pixels within face 2730 of vehicle 2710 (e.g., pixel 2722), edges of face 2730 may be estimated. By identifying discrete faces of vehicle 2710 represented in the image, the system may more accurately orient bounding box 2720 such that it corresponds to an orientation of vehicle 2710... for vehicles detected from side-facing cameras, improper orientation of a bounding box may improperly indicate the vehicle is traveling toward the host vehicle (e.g., a cut-in scenario). The disclosed techniques may be particularly beneficial in situations where target vehicles occupy much if not all of a captured image (e.g., where at least one edge of the vehicle is not present to suggest the orientation of the bounding box), where reflections of a target vehicle are included in an image, where a vehicle is being carried on a trailer or carrier, etc... processing unit 110 may determine that pixel 2722 represents a face 2730 of vehicle 2710. Accordingly, one or more distances to an edge of face 2730, such as distances 2732 and 2734 may be determined. 
For example, the distance values may include a distance from a particular pixel to at least one of a forward edge, rearward edge, side edge, top edge, or bottom edge of the target vehicle).
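Claim 3, as mapped above, narrows the selection to a two-step lookup: first by vehicle type, then by vehicle attitude. A minimal illustrative Python sketch of such a lookup follows; the nested-dictionary layout, attitude labels, and coordinates are assumptions, not teachings of Shambik or limitations of the claim.

```python
# Hypothetical two-step selection corresponding to claim 3: first by vehicle
# type, then by vehicle attitude. Layout, labels, and coordinates are invented.
UNIQUE_POINTS = {
    "truck": {
        "front": [(0.42, 0.55), (0.61, 0.58)],
        "rear":  [(0.40, 0.62), (0.63, 0.60)],
        "side":  [(0.30, 0.50), (0.75, 0.52)],
    },
    # ... other applicable vehicle types would follow the same layout
}


def select_target_unique_points(vehicle_type: str, vehicle_attitude: str):
    by_type = UNIQUE_POINTS[vehicle_type]   # select unique points for the vehicle type
    return by_type[vehicle_attitude]        # select the attitude-specific subset


points = select_target_unique_points("truck", "rear")
```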
Claim 4:
Shambik discloses each of the plurality of types of unique feature points includes a plurality of feature points to which a reliability that varies depending on vehicle attitude is given, the processing circuitry is further configured to execute acquiring information on the vehicle attitude of the target vehicle, and the selecting the target unique feature point includes: selecting, as the target unique feature point, a unique feature point corresponding to the vehicle type of the target vehicle from the plurality of types of unique feature points; and excluding one or more feature points whose the reliability is less than a threshold value from the plurality of feature points included in the target unique feature point based on the vehicle attitude of the target vehicle (0365, 0368-369 and 0421: based on a combined analysis of pixels within face 2730 of vehicle 2710 (e.g., pixel 2722), edges of face 2730 may be estimated. By identifying discrete faces of vehicle 2710 represented in the image, the system may more accurately orient bounding box 2720 such that it corresponds to an orientation of vehicle 2710. The improved accuracy of the orientation of the bounding box may improve the ability of the system to determine appropriate navigation action responses. For example, for vehicles detected from side-facing cameras, improper orientation of a bounding box may improperly indicate the vehicle is traveling toward the host vehicle (e.g., a cut-in scenario). The disclosed techniques may be particularly beneficial in situations where target vehicles occupy much if not all of a captured image (e.g., where at least one edge of the vehicle is not present to suggest the orientation of the bounding box), where reflections of a target vehicle are included in an image, where a vehicle is being carried on a trailer or carrier, etc...the disclosed techniques may be used to determine bounding boxes that represent vehicles that are not target vehicles and thus should not be associated with a bounding box for purposes of navigation determinations. In some embodiments, this may include vehicles that are being towed or carried by other vehicles. FIG. 29 is an illustration of an example image 2900 showing vehicles on a carrier, consistent with the disclosed embodiments... each of the pixels in the image may be analyzed to identify a boundary of the vehicles. For example, pixels associated with carrier vehicle 2910 may be analyzed to determine a boundary of carrier vehicle 2910, as described above. The system may also analyze pixels associated with vehicles 2920 and 2930. Based on the analysis of the pixels, the system may determine that vehicles 2920 and 2930 are not target vehicles and thus should not be associated with a bounding box... The image analysis layer may be configured to detect some or all of these features and segment individual pixels within the image into multiple classes. In some embodiments, the classification may include binary classifications, such as “car”/“not car,” in which each pixel is determined to either be associated with a vehicle or not).
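Claim 4, as mapped above, instead attaches an attitude-dependent reliability to each unique feature point and excludes points whose reliability falls below a threshold. The short Python sketch below illustrates that exclusion step; the data structure, field names, and threshold value are assumptions for illustration only.

```python
# Minimal sketch of the claim-4 exclusion step: each unique feature point
# carries a reliability that varies with vehicle attitude, and points whose
# reliability is below a threshold are excluded. Fields and values are invented.
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class UniqueFeaturePoint:
    u: float
    v: float
    reliability_by_attitude: Dict[str, float]  # e.g., {"front": 0.9, "side": 0.2}


def filter_by_reliability(points: List[UniqueFeaturePoint],
                          vehicle_attitude: str,
                          threshold: float = 0.5) -> List[UniqueFeaturePoint]:
    return [p for p in points
            if p.reliability_by_attitude.get(vehicle_attitude, 0.0) >= threshold]


candidates = [
    UniqueFeaturePoint(0.41, 0.57, {"front": 0.9, "side": 0.2}),
    UniqueFeaturePoint(0.66, 0.59, {"front": 0.3, "side": 0.8}),
]
kept = filter_by_reliability(candidates, "side")  # keeps only the second point
```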
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 2 and 5 are rejected under 35 U.S.C. 103 as being unpatentable over Shambik.
Claim 2:
Shambik teaches the trained model includes: an upper layer that receives the captured image as input; a universal feature point extraction layer that receives an output of the upper layer as input and outputs the universal feature point; and a plurality of unique feature point extraction layers corresponding to the plurality of applicable vehicle types, each of which receives the output of the upper layer as input and outputs a unique feature point according to the corresponding applicable vehicle type (0357, 0358, 0404: Such a unified image analysis framework, for example, may include a single image analysis layer that receives captured image frames as input, analyzes and characterizes the pixels associated with the captured image frames, and provides the characterized image frames as output. The characterized image frames can then be supplied to the multiple different functions/modules of the navigation system for generation and implementation of appropriate navigational actions based on the characterized image frames... each pixel of an image may be analyzed to determine whether the pixel is associated with certain types of objects or features in the environment of a host vehicle. For example, each pixel within an image or a portion of an image may be analyzed to determine whether they are associated with another vehicle in the environment of the host vehicle. Additional information may be determined for each pixel, such as whether the pixel corresponds to an edge of a vehicle, a face of a vehicle, etc... the image analysis layer may include a trained neural network that characterizes segments of the captured images according to a training protocol of the neural network), the upper layer and the universal feature point extraction layer have been trained using first training data, the first training data configured of a plurality of images showing various vehicles without specifying the vehicle type, and each of the plurality of unique feature point extraction layers has been trained using second training data, the second training data configured of a plurality of images showing vehicles of the corresponding applicable vehicle type (0420 and 0422: Once the training data set is generated, it can be provided to the neural network so the model may be validated. Once the neural network is trained and validated, it can then be used in a vehicle navigation system for analyzing and characterizing captured images on the fly for use in navigating the vehicle. For example, each image captured by a camera onboard a vehicle may be provided to the navigation system for characterization by the image analysis layer. This layer may recognize any or all features of interest included in the vehicle environment and represented in the corresponding captured images. Such features of interest, as noted above, may include any feature of an environment based on which one or more aspects of vehicle navigation may depend (e.g., vehicles, other objects, wheels, vegetation, roads, road edges, barriers, road lane markings, potholes, tar strips, traffic signs, pedestrians, traffic lights, utility infrastructure, free space regions, types of free space (parking lot, driveway, etc., which can be important in determining whether a non-road region is navigable in the case of an emergency for example), and many other features). The output of the image analysis layer of the trained neural network may be fully or partially designated image frames in which the pixels of all features of interests are designated. 
This output can then be provided to a plurality of different vehicle navigation functions/modules, and each individual module/function can rely upon the designated image frames to derive the information it needs for providing its corresponding navigational functionality... each pixel may be classified according to a list of predefined classes of objects or features. Such classes may include, but are not limited to, car, truck, bus, motorcycle, pedestrian, road, barrier, guardrail, painted road surface, elevated areas relative to road, drivable road surface, poles, or any other feature that may be relevant to a vehicle navigation system. FIG. 28B illustrates an image 3400B with classified regions of pixels, consistent with the disclosed embodiments. Image 3400B may correspond to image 3400A after it has been processed through the image analysis layer).
Shambik, by itself, does not explicitly teach multiple layers (e.g., an upper layer, a universal feature point extraction layer, and a plurality of unique feature point extraction layers); rather, Shambik discusses a single image analysis layer that performs all of the functions attributed to the multiple claimed layers. Shambik discusses the advantage of having a single layer over having multiple layers in 0424: The new trained image analysis layer may be, in effect, backwards compatible with previously developed modules/functions that do not rely upon or need the new feature. Those that can take advantage of the newly identified feature(s) by the newly trained image analysis layer can be modified or newly developed. In other words, new capabilities may be added without modifying large parts of the navigational system. And, by performing all of the image analysis and classification in a single image analysis layer, there are no function/module-specific image analysis algorithms that need to be developed. As a result, new navigational features based on newly identified categories or classes of features of interest in an environment of a vehicle may require little additional computational resources for implementation, especially from an image analysis perspective. The claim is therefore obvious in view of Shambik, as the omission of a step is obvious if the function of the step is not desired. See MPEP 2144.04 II A; citing In re Kuhle, 526 F.2d 553, 188 USPQ 7 (CCPA 1975) (deleting a prior art switch member and thereby eliminating its function was an obvious expedient).
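For illustration of the multi-layer arrangement recited in claim 2 and discussed above, the following Python (PyTorch) sketch shows one possible way an upper layer, a universal feature point extraction layer, and per-vehicle-type unique feature point extraction layers could be arranged. This is a hypothetical sketch only: the class name, layer sizes, vehicle types, and number of output points are assumptions and are not taken from Shambik or from the claims; Shambik itself describes a single image analysis layer rather than this multi-head structure.

```python
# Hypothetical PyTorch sketch of the claimed multi-head arrangement; this is
# NOT Shambik's single image analysis layer. Sizes and types are assumptions.
import torch
import torch.nn as nn


class FeaturePointModel(nn.Module):
    def __init__(self, vehicle_types=("sedan", "truck", "bus"), num_points=8):
        super().__init__()
        # "Upper layer": shared trunk that receives the captured image as input.
        self.upper = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # Universal feature point extraction layer (vehicle-type independent).
        self.universal_head = nn.Linear(64, num_points * 2)
        # One unique feature point extraction layer per applicable vehicle type.
        self.unique_heads = nn.ModuleDict(
            {t: nn.Linear(64, num_points * 2) for t in vehicle_types})

    def forward(self, image: torch.Tensor):
        shared = self.upper(image)                   # output of the upper layer
        universal = self.universal_head(shared)      # universal feature points
        unique = {t: head(shared) for t, head in self.unique_heads.items()}
        return universal, unique


model = FeaturePointModel()
universal, unique = model(torch.zeros(1, 3, 64, 64))  # dummy captured image
```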
Claim 5:
Shambik teaches A generation method for generating a trained model for causing a computer to extract a feature point of a vehicle shown in a target image, wherein the trained model includes: an upper layer that receives the target image as input; a universal feature point extraction layer that receives an output of the upper layer as input and outputs the feature point of the vehicle; and a plurality of unique feature point extraction layers corresponding to a plurality of applicable vehicle types, each of which receives the output of the upper layer as input and outputs the feature point of the vehicle (0357, 0358, 0404: Such a unified image analysis framework, for example, may include a single image analysis layer that receives captured image frames as input, analyzes and characterizes the pixels associated with the captured image frames, and provides the characterized image frames as output. The characterized image frames can then be supplied to the multiple different functions/modules of the navigation system for generation and implementation of appropriate navigational actions based on the characterized image frames... each pixel of an image may be analyzed to determine whether the pixel is associated with certain types of objects or features in the environment of a host vehicle. For example, each pixel within an image or a portion of an image may be analyzed to determine whether they are associated with another vehicle in the environment of the host vehicle. Additional information may be determined for each pixel, such as whether the pixel corresponds to an edge of a vehicle, a face of a vehicle, etc... the image analysis layer may include a trained neural network that characterizes segments of the captured images according to a training protocol of the neural network), and the generation method includes: training the upper layer and the universal feature point extraction layer by using first training data, the first training data configured of a plurality of images showing various vehicles without specifying vehicle type; and training each of the plurality of unique feature point extraction layers by using second training data, the second training data configured of a plurality of images showing vehicles of the corresponding applicable vehicle type (0420 and 0422: Once the training data set is generated, it can be provided to the neural network so the model may be validated. Once the neural network is trained and validated, it can then be used in a vehicle navigation system for analyzing and characterizing captured images on the fly for use in navigating the vehicle. For example, each image captured by a camera onboard a vehicle may be provided to the navigation system for characterization by the image analysis layer. This layer may recognize any or all features of interest included in the vehicle environment and represented in the corresponding captured images. Such features of interest, as noted above, may include any feature of an environment based on which one or more aspects of vehicle navigation may depend (e.g., vehicles, other objects, wheels, vegetation, roads, road edges, barriers, road lane markings, potholes, tar strips, traffic signs, pedestrians, traffic lights, utility infrastructure, free space regions, types of free space (parking lot, driveway, etc., which can be important in determining whether a non-road region is navigable in the case of an emergency for example), and many other features). 
The output of the image analysis layer of the trained neural network may be fully or partially designated image frames in which the pixels of all features of interests are designated. This output can then be provided to a plurality of different vehicle navigation functions/modules, and each individual module/function can rely upon the designated image frames to derive the information it needs for providing its corresponding navigational functionality... each pixel may be classified according to a list of predefined classes of objects or features. Such classes may include, but are not limited to, car, truck, bus, motorcycle, pedestrian, road, barrier, guardrail, painted road surface, elevated areas relative to road, drivable road surface, poles, or any other feature that may be relevant to a vehicle navigation system. FIG. 28B illustrates an image 3400B with classified regions of pixels, consistent with the disclosed embodiments. Image 3400B may correspond to image 3400A after it has been processed through the image analysis layer).
Shambik, by itself, does not explicitly teach multiple layers (e.g., an upper layer, a universal feature point extraction layer, and a plurality of unique feature point extraction layers); rather, Shambik discusses a single image analysis layer that performs all of the functions attributed to the multiple claimed layers. Shambik discusses the advantage of having a single layer over having multiple layers in 0424: The new trained image analysis layer may be, in effect, backwards compatible with previously developed modules/functions that do not rely upon or need the new feature. Those that can take advantage of the newly identified feature(s) by the newly trained image analysis layer can be modified or newly developed. In other words, new capabilities may be added without modifying large parts of the navigational system. And, by performing all of the image analysis and classification in a single image analysis layer, there are no function/module-specific image analysis algorithms that need to be developed. As a result, new navigational features based on newly identified categories or classes of features of interest in an environment of a vehicle may require little additional computational resources for implementation, especially from an image analysis perspective. The claim is therefore obvious in view of Shambik, as the omission of a step is obvious if the function of the step is not desired. See MPEP 2144.04 II A; citing In re Kuhle, 526 F.2d 553, 188 USPQ 7 (CCPA 1975) (deleting a prior art switch member and thereby eliminating its function was an obvious expedient).
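As a companion to the architecture sketch above, the following hedged Python sketch illustrates the two-stage training sequence recited in claim 5: the upper layer and the universal head are trained on first training data (images of various vehicles, vehicle type unspecified), and each type-specific head is then trained on second training data for its vehicle type. The loss function, optimizer, epoch count, freezing choices, and data format are assumptions for illustration; neither Shambik nor the claim specifies them.

```python
# Hedged sketch of the claim-5 two-stage generation method, reusing the
# hypothetical FeaturePointModel above. Loss, optimizer, epochs, and data
# format are assumptions; they are not specified by the claim or by Shambik.
import torch


def generate_trained_model(model, first_data, second_data_by_type, epochs=10):
    loss_fn = torch.nn.MSELoss()

    # Stage 1: first training data (various vehicles, vehicle type unspecified)
    # trains the upper layer together with the universal head.
    opt = torch.optim.Adam(list(model.upper.parameters()) +
                           list(model.universal_head.parameters()))
    for _ in range(epochs):
        for image, target_points in first_data:
            universal_pred, _ = model(image)
            loss = loss_fn(universal_pred, target_points)
            opt.zero_grad()
            loss.backward()
            opt.step()

    # Stage 2: second training data, one image set per applicable vehicle type,
    # trains the corresponding unique feature point extraction layer.
    for vtype, data in second_data_by_type.items():
        head_opt = torch.optim.Adam(model.unique_heads[vtype].parameters())
        for _ in range(epochs):
            for image, target_points in data:
                _, unique_pred = model(image)
                loss = loss_fn(unique_pred[vtype], target_points)
                head_opt.zero_grad()
                loss.backward()
                head_opt.step()
    return model
```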
Note
The Examiner cites particular columns, line numbers, and/or paragraph numbers in the references as applied to the claims above for the convenience of the Applicant(s). Although the specified citations are representative of the teachings in the art and are applied to the specific limitations within the individual claim, other passages and figures may apply as well. It is respectfully requested that, in preparing responses, the Applicant fully consider the references in their entirety as potentially teaching all or part of the claimed invention, as well as the context of the passage as taught by the prior art or disclosed by the Examiner. See MPEP 2123.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure and is listed in the attached PTOL-892 form.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MOHAMMED-IBRAHIM ZUBERI whose telephone number is (571) 270-7761. The examiner can normally be reached M-Th 8-6, Fri 7-12/OFF.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Steph Hong, can be reached at (571) 272-4124. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/MOHAMMED H ZUBERI/ Primary Examiner, Art Unit 2178