DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Election/Restrictions
Applicant’s election of Invention III (claims 9-12) in the reply filed on 2/10/2026 is acknowledged. Because applicant did not distinctly and specifically point out the supposed errors in the restriction requirement, the election has been treated as an election without traverse (MPEP § 818.01(a)). Claims 1-20 are pending; claims 2-8, 13-14, and 16-18 are withdrawn from further consideration. Thus, claims 1, 9-12, 15, and 19-20 are examined as detailed below.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1, 11-12, 15, and 20 are rejected under 35 U.S.C. § 103 as being unpatentable over Guo (Guo et al., US 2020/0202196 A1, 2020).
Regarding claim 1, Guo teaches a method for object classification comprising:
receiving, by a processing circuit, a cropped sensed information unit ([0026-0027] & [0030] & [0066]: Guo teaches an on-board system physically located on a vehicle operating in an environment, receiving from a sensor (e.g., camera, LiDAR) a patch of captured data [corresponding to receiving a sensed information unit] of an object located in a region of the environment.);
generating, by the processing circuit, an object embedding information item representing the object (Guo, in [0042-0047], further teaches that a neural network maps a patch of captured data of an object located in a region of the environment to a feature / numerical vector in an embedding space. This feature vector, [0010], is an object embedding information item that encodes the object’s characteristics for classification [0047-0050].);
comparing the object embedding information item to a plurality of reference embeddings information items that represent reference embedding information items clusters ([0052-0056]: Guo describes maintaining a sensor data repository 125 in which each stored sensor sample is associated with one or more embeddings, and a search index that associates each sensor sample with its embeddings. To make search efficient, Guo teaches slicing the embedding index for each embedding type using a clustering technique, e.g., k-means, producing multiple slices, where each slice corresponds to one of k clusters and is represented by a prototype or mean (cluster center) for that cluster. Searching [using the query embeddings to search the embeddings in the repository] and comparing the object embedding to each reference embedding (class or cluster centroid) in the embedding repository corresponds to this recited limitation.);
identifying, based on the comparing step, a matching reference embedding information item that represents a matching reference embedding information items cluster (Guo in [0055-0057] teaches, for each query embedding, identifying the slice whose prototype is closest to the query embedding, and then identifying embeddings within that slice that are closest to the query embedding according to a distance metric [0071-0073] (e.g., Euclidean distance or cosine similarity).);
classifying the object as being associated with an object classification that is associated with the matching reference embedding information item cluster ([0050]: properties include one or more high-level object types (e.g., class); [0076]: embeddings, along with their properties, that are most relevant to the query embedding are returned in order to identify an object type of an object located in the environment region; [0077]: Guo teaches that labels can be associated with sensor samples in the repository, including a high-level classification that identifies an object type, such that the system classifies the object); and
wherein the classifying triggers a determination of a driving related operation to be executed by the vehicle ([0032-0034]: Guo teaches that on-board classifiers in an autonomous-driving system provide predictions to a planning subsystem, and the planning subsystem uses the predictions to make fully- or semi-autonomous driving decisions; e.g., based on a predicted traffic sign type, the planning subsystem can trigger an action to adjust the vehicle trajectory and “apply the brakes” when the traffic sign is a yield sign).
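The slice-then-search scheme Guo is cited for ([0052]-[0057]: k-means slicing of the embedding index, nearest-prototype selection, then nearest-neighbor search within the selected slice) can be sketched as follows. This is a minimal illustration under the examiner's reading, not Guo's actual implementation; all function and variable names are hypothetical.

```python
import numpy as np

def build_slices(reference_embeddings, k, iters=20):
    """Partition reference embeddings into k slices with a simple k-means.

    Returns the cluster centers (slice prototypes) and, for each slice,
    the indices of the reference embeddings assigned to it.
    """
    rng = np.random.default_rng(0)
    centers = reference_embeddings[rng.choice(len(reference_embeddings), k, replace=False)]
    for _ in range(iters):
        # Assign each embedding to its nearest prototype (Euclidean distance).
        d = np.linalg.norm(reference_embeddings[:, None, :] - centers[None, :, :], axis=2)
        assign = d.argmin(axis=1)
        for c in range(k):
            members = reference_embeddings[assign == c]
            if len(members):
                centers[c] = members.mean(axis=0)
    slices = {c: np.flatnonzero(assign == c) for c in range(k)}
    return centers, slices

def search(query, reference_embeddings, centers, slices):
    """Find the closest reference embedding: pick the slice whose prototype
    is nearest the query, then scan only that slice's members."""
    nearest_slice = int(np.linalg.norm(centers - query, axis=1).argmin())
    members = slices[nearest_slice]
    dists = np.linalg.norm(reference_embeddings[members] - query, axis=1)
    return int(members[dists.argmin()])
```

The design point the rejection relies on is that only one slice is scanned per query, which is why slicing makes the repository search efficient.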
While Guo discloses receiving a sensed information unit, Guo does not explicitly disclose that it is a cropped sensed information unit; however, it would have been obvious to include this feature. Guo's disclosure already operates on a localized portion of sensed data corresponding to a region / object of interest (i.e., a “patch” / region of the environment that is processed for embedding and matching) [0029-0030], [0066], and a person of ordinary skill would have recognized that cropping the sensed information to that region / object of interest is a routine, predictable implementation detail in perception pipelines. Cropping is a well-known way to (i) focus the embedding / matching on the object rather than irrelevant background [0046-0047], (ii) reduce computational burden [0074], and (iii) improve robustness / accuracy of object recognition [0079], outcomes that align with Guo's stated goals of efficient and accurate recognition in an autonomous-vehicle context. Accordingly, using a cropped sensed information unit (e.g., a region-of-interest / object-centered patch extracted from a larger sensor frame or point cloud) in Guo's disclosed workflow would have been an obvious modification that applies known input preprocessing according to its established function and yields predictable results.
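The routine cropping operation referred to above can be illustrated with a minimal sketch, assuming a NumPy image array and an axis-aligned bounding box; the function and parameter names are hypothetical and not drawn from either reference.

```python
import numpy as np

def crop_to_roi(frame, bbox, pad=0):
    """Extract an object-centered patch from a larger sensor frame.

    frame: H x W x C image array; bbox: (x_min, y_min, x_max, y_max) in pixels.
    `pad` adds optional context around the box; coordinates are clamped to
    the frame so the crop never indexes out of bounds.
    """
    h, w = frame.shape[:2]
    x0 = max(0, bbox[0] - pad)
    y0 = max(0, bbox[1] - pad)
    x1 = min(w, bbox[2] + pad)
    y1 = min(h, bbox[3] + pad)
    return frame[y0:y1, x0:x1]
```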
Regarding claim 11, Guo teaches the method of claim 1, further comprising dynamically updating the reference embedding information items clusters ([0015], [0051], and more specifically [0061]: Guo teaches that the system dynamically updates the slices (the k embedding-information clusters) of the repository index to account for newly added embeddings of a new kind of sensor sample entry).
Regarding claim 12, Guo teaches the method of claim 1, wherein the processing circuit applies a neural network to generate the object embedding information item (Guo, in [0042-0047], further teaches that a neural network maps a patch of captured data of an object located in a region of the environment to a feature / numerical vector in an embedding space. This feature vector, [0010], is an object embedding information item that encodes the object’s characteristics for classification [0047-0050].).
Regarding claims 15 & 20, the rationale provided for claim 1 is incorporated herein. In addition, the method of claim 1 corresponds to the non-transitory computer-readable storage medium of claim 15, as well as the system of claim 20, which perform the steps disclosed herein. Therefore, claims 15 and 20 are rejected under 35 U.S.C. § 103 for the same reasons as claim 1.
Claims 9-10 are rejected under 35 U.S.C. § 103 as being unpatentable over Guo in view of Staudinger (Staudinger et al., US 2021/0264300 A1, 2021).
Regarding claim 9, Guo fails to disclose but Staudinger explicitly teaches the method of claim 1, wherein the reference embeddings information item clusters were generated during a supervised machine learning training comprising feeding cropped sensed information units to a bounding shape generating neural network (Staudinger, in [0065-0066], teaches that a “chip” is a crop out of the original image from the object neural network, and that the neural network may place a bounding polygon to enable cropping. Staudinger also teaches that the object detection model is trained via supervised learning using labeled image data [0067]. Staudinger then teaches providing images to the object detection model such that it predicts ROIs / bounding polygons [0089], converting predicted ROIs to first subregions by cropping the bounds from the images [0091], providing the cropped subregions to an embedding model to output feature vectors / embeddings [0093], and clustering the feature vectors of the first subregions into a plurality of clusters [0095]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Guo's embedding-based classification framework to generate the reference embeddings information item clusters during supervised machine-learning training using cropped sensed information units produced by a bounding-shape-generating neural network, as taught by Staudinger. Guo already relies on embeddings to represent sensed information items for similarity matching / classification, and Staudinger provides a known, compatible technique for improving the quality and consistency of those embeddings by (i) using a supervised object-detection network to predict ROIs / bounding polygons, (ii) cropping object-centered chips / subregions from the sensed images, and (iii) clustering the resulting embeddings. Incorporating Staudinger's ROI-based cropping and supervised training into Guo would have predictably reduced background clutter and improved the discriminative power and stability of the learned embeddings and resulting clusters, thereby yielding more reliable cluster formation and subsequent object classification, an expected benefit in computer-vision and autonomous-system perception pipelines.
Regarding claim 10, Guo [as modified by Staudinger] teaches the method of claim 9, wherein the supervised machine learning training comprises applying a cost function that induces generation of similar reference embeddings information items to similar objects and dissimilar reference embeddings information items to dissimilar objects (Staudinger explicitly teaches training the embedding model using a triplet / ranking loss or cost function that enforces an ordering of distances such that samples with the same label are closer in embedding space than samples with different labels [0030], [0076], and further teaches triplet ranking loss or cost function using an anchor / positive / negative relationship such that embeddings of similar objects are pulled together while embeddings of dissimilar objects are pushed apart [0077]).
Claim 19 is rejected under 35 U.S.C. § 103 as being unpatentable over Guo in view of Afra (Afra et al., US 2020/0394772 A1, 2020).
Regarding claim 19, Guo fails to disclose but Afra explicitly teaches the non-transitory computer readable medium according to claim 15, wherein the object classification system is further configured to limit the dynamic range of the sensed information unit ([0239]: Afra teaches applying a tone mapping operator (e.g., a log function) to an HDR image such that “the log function compresses a range of values significantly” and further teaches safely clipping higher luminance values and scaling the log value (e.g., by 1/16) so that the values are compressed to the [0,1] range, i.e., limiting the dynamic range of the sensed image data [0243]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Afra's dynamic-range limiting (tone mapping / clipping / scaling to [0,1]) into Guo's sensed-image object-classification pipeline because both references process sensed images with ML models, and Afra expressly teaches that compressing HDR values into a bounded range (e.g., [0,1]) makes training faster and more stable and avoids issues with large HDR values in common representations (e.g., FP16), which is a predictable benefit when feeding images into neural networks for detection / embedding / classification.
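The dynamic-range limiting Afra is cited for (log tone mapping, clipping of high luminance values, and scaling into [0,1]) can be sketched as follows. The use of log base 2, the 2^16 clip level, and the function name are assumptions chosen to match the 1/16 scale factor noted in the mapping above; Afra's exact operator may differ.

```python
import numpy as np

def limit_dynamic_range(hdr, clip_level=2.0**16, scale=1.0/16.0):
    """Compress HDR luminance into [0, 1]: clip high values, apply a log,
    then scale so the maximum clipped value maps to (approximately) 1.0."""
    clipped = np.clip(hdr, 0.0, clip_level)
    # log1p-style shift avoids log(0); with clip_level = 2^16 and
    # scale = 1/16, log2(1 + x) * scale can slightly exceed 1, so the
    # result is clamped once more into [0, 1].
    mapped = np.log2(1.0 + clipped) * scale
    return np.clip(mapped, 0.0, 1.0)
```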
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to KEN KUDO whose telephone number is (571)272-4498. The examiner can normally be reached M-F 8am - 5pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Vincent Rudolph can be reached at 571-272-8243. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
KEN KUDO
Examiner
Art Unit 2671
/KEN KUDO/Examiner, Art Unit 2671
/VINCENT RUDOLPH/Supervisory Patent Examiner, Art Unit 2671