Last updated: May 29, 2026

Application No. 18/674,666

MODELING EQUIVARIANCE IN POINT CLOUDS USING NEURAL NETWORKS FOR THREE-DIMENSIONAL OBJECT DETECTION AND RECOGNITION

Non-Final OA §102 §103 §112

Filed

May 24, 2024

Priority

Oct 20, 2023 — provisional 63/544,949 +1 more

Examiner

DANG, DUY M

Art Unit

2662

Tech Center

2600 — Communications

Assignee

Nvidia Corporation

OA Round

1 (Non-Final)

Interview Optional

Interview lift of +6.0% is below the 15.0% threshold, so a written response is recommended.

Based on 860 resolved cases, 2023–2026

Examiner Intelligence

DANG, DUY M

Grants 91% — above average

Career Allowance Rate

786 granted / 860 resolved

+29.4% vs TC avg

Interview Lift: +6.0% (moderate), comparing resolved cases without and with an interview

Typical timeline

2y 7m

Avg Prosecution

22 currently pending

Career history

882

Total Applications

across all art units

Statute-Specific Performance

§101

20.8%

-19.2% vs TC avg

§103

25.7%

-14.3% vs TC avg

§102

18.5%

-21.5% vs TC avg

§112

15.6%

-24.4% vs TC avg

Tech Center averages are estimates. Based on career data from 860 resolved cases.
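The four "vs TC avg" figures above can be cross-checked with a little arithmetic. The sketch below assumes each difference is expressed in percentage points (examiner rate minus Tech Center average); the page does not state this, and the helper name `implied_tc_average` is hypothetical.

```python
# Figures copied from the Statute-Specific Performance panel:
# statute -> (examiner rate in %, stated difference vs. Tech Center average).
stats = {
    "101": (20.8, -19.2),
    "103": (25.7, -14.3),
    "102": (18.5, -21.5),
    "112": (15.6, -24.4),
}

def implied_tc_average(rate, delta):
    """Assumption: delta = rate - tc_average, in percentage points."""
    return round(rate - delta, 1)

implied = {statute: implied_tc_average(r, d) for statute, (r, d) in stats.items()}
print(implied)
```

Under that assumption, every statute implies the same 40.0% Tech Center figure, which is consistent with the note that the Tech Center average is a single estimate.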

Office Action

§102 §103 §112

DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Interpretation

The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The following is a quotation of pre-AIA 35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

Use of the word “means” (or “step for”) in a claim with functional language creates a rebuttable presumption that the claim element is to be treated in accordance with 35 U.S.C. § 112(f) (pre-AIA  35 U.S.C. 112, sixth paragraph).  The presumption that § 112(f) (pre-AIA  § 112, sixth paragraph) is invoked is rebutted when the function is recited with sufficient structure, material, or acts within the claim itself to entirely perform the recited function.
Absence of the word “means” (or “step for”) in a claim creates a rebuttable presumption that the claim element is not to be treated in accordance with 35 U.S.C. § 112(f) (pre-AIA  35 U.S.C. 112, sixth paragraph).  The presumption that § 112(f) (pre-AIA  § 112, sixth paragraph) is not invoked is rebutted when the claim element recites function but fails to recite sufficiently definite structure, material or acts to perform that function.
Claim elements in this application that use the word “means” (or “step for”) are presumed to invoke § 112(f) except as otherwise indicated in an Office action.  Similarly, claim elements that do not use the word “means” (or “step for”) are presumed not to invoke § 112(f) except as otherwise indicated in an Office action.
Claim limitation “units” has been interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, because it uses a generic placeholder “units” coupled with functional language without reciting sufficient structure to achieve the function. Furthermore, the generic placeholder is not preceded by a structural modifier.
Since the claim limitation invokes 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, claims 11-20 have been interpreted to cover the corresponding structure described in the specification that achieves the claimed function, and equivalents thereof.
A review of the specification shows that the following appears to be the corresponding structure described in the specification for the 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph limitation: processors described in paragraph [0026] and memory described in paragraph [0030].
If applicant wishes to provide further explanation or dispute the examiner’s interpretation of the corresponding structure, applicant must identify the corresponding structure with reference to the specification by page and line number, and to the drawing, if any, by reference characters in response to this Office action.
If applicant does not intend to have the claim limitation treated under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, applicant may amend the claims so that they clearly do not invoke 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, or present a sufficient showing that the claims recite sufficient structure, material, or acts for performing the claimed function to preclude application of 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph.
For more information, see M.P.E.P. § 2173 et seq. and Supplementary Examination Guidelines for Determining Compliance With 35 U.S.C. 112 and for Treatment of Related Issues in Patent Applications, 76 FR 7162, 7167 (Feb. 9, 2011).
Claims 1-10 are not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, because they are all method claims.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 8 and 12 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA ), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA  35 U.S.C. 112, the applicant), regards as the invention.
The recitation “L1” in each of claims 8 and 12 is unclear as to what it refers to.
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA  to pre-AIA ) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.

Claims 1-4 and 7-20 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Mustikovela et al. (U.S. Pat. App. Pub. No. 2021/0150757 A1, referred to as Mustikovela hereinafter).
Regarding claim 1 as a representative claim, Mustikovela teaches a method comprising:
determining a first prediction associated with partitioning a plurality of points in a representation of a scene into a first set of parts (see figure 8 and para. [0118] (first prediction 808; V1; Z1; for example, “first prediction 808 corresponds to a predicted specific orientation of an object within an image, and comprises specific values for a set of parameters comprising an azimuth parameter, an elevation parameter, and a tilt parameter. In at least one embodiment, a first prediction 808 comprises a first predicted viewpoint V1 and a first predicted set of appearance parameters Z1 of input image 802. In at least one embodiment, first prediction 808 comprises a predicted orientation of a car depicted in input image 802.”; also see discriminator depicted in figures 1-7 and 9-12));
generating, using a neural network, a second prediction associated with partitioning the plurality of points into a second set of parts based at least on one or more aggregations associated with the first set of parts (see figure 8 and para. [0118] (discriminator 806 employs one or more neural networks; second prediction 810; V2; Z2; for example, “In at least one embodiment, second prediction 810 corresponds to a predicted specific orientation of an object within an image, and comprises specific values for a set of parameters comprising an azimuth parameter, an elevation parameter, and a tilt parameter. In at least one embodiment, a second prediction 810 comprises a second predicted viewpoint V2 and a second predicted set of appearance parameters Z2 of transformed image 804. In at least one embodiment, second prediction 810 comprises a predicted orientation of a car depicted in transformed image 804.”; under the broadest reasonable interpretation (BRI), classification (described throughout the reference), collection (202 of figure 2; 102 of figure 1) parameters, appearance, and orientations correspond to the claimed one or more aggregations associated with the first set of parts); also see discriminator depicted in figures 1-7 and 9-12);
determining a plurality of piecewise equivariant regions in the scene based on the second prediction (see paras. [0119] – [0120] (matching predictions between images 802 and 804)); and
generating an object recognition result associated with the plurality of points based on the plurality of piecewise equivariant regions (see paras. [0077] (real/fake classification generates an object recognition result based on the predictions of the discriminator), [0090] (classification 416 generates an object recognition result based on the prediction of whether images are real or fake)).
Regarding claim 2, Mustikovela further teaches wherein the determining the first prediction comprises randomly assigning at least one point included in the plurality of points to a part included in the first set of parts (see para. [0118] (V1; Z1)).
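For readers unfamiliar with the claim language, the following is a minimal sketch of what a random first partition, per-part aggregation, and a refined second partition could look like for a set of 3D points. It is an illustration only: a nearest-centroid reassignment stands in for the claimed neural network, and it reflects neither the applicant's specification nor Mustikovela. All function names are hypothetical.

```python
import math
import random

def aggregate_parts(points, labels, num_parts):
    """Per-part centroid (the 'aggregation'); empty parts use the global centroid."""
    dim = len(points[0])
    global_mean = [sum(p[d] for p in points) / len(points) for d in range(dim)]
    centers = []
    for k in range(num_parts):
        members = [p for p, lab in zip(points, labels) if lab == k]
        if members:
            centers.append([sum(p[d] for p in members) / len(members) for d in range(dim)])
        else:
            centers.append(list(global_mean))
    return centers

def refine_partition(points, labels, num_parts):
    """Second prediction (stand-in): reassign each point to its nearest part centroid."""
    centers = aggregate_parts(points, labels, num_parts)
    return [min(range(num_parts), key=lambda k: math.dist(p, centers[k])) for p in points]

def rotate_z(points, theta):
    """Rotate every point about the z-axis by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in points]

random.seed(0)
points = [tuple(random.gauss(0.0, 1.0) for _ in range(3)) for _ in range(200)]
first = [random.randrange(4) for _ in points]      # random first partition
second = refine_partition(points, first, 4)        # refined second partition

# Rotating the input leaves the refined labels unchanged, because centroids
# rotate with the points and distances are preserved.
second_rotated = refine_partition(rotate_z(points, 0.7), first, 4)
labels_unchanged = second == second_rotated
print(labels_unchanged)
```

The final check illustrates the equivariance idea in its simplest form: the part centroids rotate with the scene while the point-to-part assignment stays the same.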
Regarding claim 3, Mustikovela further teaches wherein the first prediction is generated via execution of one or more equivariant layers of the neural network (see paras. [0118] (discriminator 806 employs one or more neural networks), and [0168] (CNN is used for predicting equivariant viewpoint (V))).
Regarding claim 4, Mustikovela further teaches wherein the second prediction comprises a conditional probability distribution over a plurality of partitions associated with the plurality of points (see paras. [0132] and [0169] (the use of probability for predicting viewpoints)).
Regarding claim 7, Mustikovela further teaches further comprising updating one or more parameters of the neural network based on one or more losses computed between the second prediction and a set of ground-truth partition assignments associated with the plurality of points (see paras. [0069] (loss (a generative consistency loss, a symmetry loss, a nearest neighbor and farthest neighbor loss, and a disentanglement loss) and lacking ground truth annotations or unavailable ground truth annotations), [0073] (“viewpoint 206 is determined based on  the ground truth annotations provided”, so when it is unavailable, there is a loss.) and [0077] (ground truth is available for computing loss); also see para. [0090]).
Regarding claim 8, Mustikovela further teaches wherein the one or more losses comprise an L1 loss (see rejection applied to claim 7 above).
Regarding claim 9, Mustikovela further teaches wherein the object recognition result comprises at least one of a classification result, a segmentation result, or an object detection result (see paras. [0077] (real/fake classification; object recognition results based on the predictions of the discriminator), [0090] (classification 416 generates an object recognition result based on the prediction of whether images are real or fake), [0242] (object detection and classification; object tracking) and [0298] (distinguishing between static and moving objects); also see paras. [0551] (group of geometric objects has been processed; it corresponds to segmentation result) and [0637] – [0638] (separate objects; objects in separate categories)).
Regarding claim 10, Mustikovela further teaches wherein the plurality of points is included in a point cloud (see paras. [0303] (point cloud), [0082] (viewpoint indicates a 3D rotation of an object; thus, depth (point cloud) is inherently included), [0088] (viewpoint indicates a 3D rotation of an object), [0242] (depth based object detection; this implies a point cloud (depth))).
Regarding claim 11, it is noted that the claim recites limitations similar to those of counterpart claim 1, and thus the statements applied to claim 1 above are incorporated herein. Mustikovela further teaches one or more processing units (see paras. [0001] (processors) and [0153] (one or more processors)).
Regarding claim 12, Mustikovela further teaches wherein the operations further comprise updating one or more parameters of the neural network based on an L1 loss computed using at least one of the first prediction, the second prediction, or a set of ground-truth partitions associated with the plurality of points (see para. [0170] (using computed loss to update parameters)).
Regarding claim 13, Mustikovela further teaches generating, via execution of a first equivariant layer of the neural network, a third prediction associated with the plurality of points based at least on input that includes the first prediction (see para. [0168]: predicting the third viewpoint); and generating, via execution of a second equivariant layer of the neural network, the second prediction based at least on input that includes the third prediction (see para. [0125] (viewpoints 914-918 comprise predicted orientations of cars depicted in images 902-906; cars as input, include third predicted orientation)).
Regarding claim 14, Mustikovela further teaches wherein generating the second prediction comprises: generating, using the neural network, one or more sets of equivariant features associated with the first prediction (see para. [0118] (discriminator 806 generates a first prediction 808, wherein the discriminator employs the neural networks as pointed out in the rejection applied to claim 1 above; 808 comprises parameters which comprise viewpoint-equivariant distances (para. [0122]))); and generating the second prediction based on the one or more sets of equivariant features (see para. [0118] (discriminator 806 generates a second prediction 810, wherein the discriminator employs the neural networks as pointed out in the rejection applied to claim 1 above; 808 comprises parameters which comprise viewpoint-equivariant distances (para. [0122]))).
Regarding claim 15, Mustikovela further teaches wherein the first set of parts is larger than the second set of parts (see para. [0129] (first viewpoint is of an anchor image; second viewpoint is of a nearest neighbor image; this implies the first viewpoint is larger than the second viewpoint)).
Regarding claim 16, Mustikovela further teaches wherein the neural network includes one or more equivariant layers in an encoder that converts the plurality of points into a latent representation (see paras. [0610] (object’s orientation is encoded on a set of parameters) and [0062] (predicted orientation or viewpoint is encoded); also see para. [0134] (autoencoder)).
Regarding claim 17, Mustikovela further teaches wherein the one or more equivariant layers are further included in a decoder that generates the object recognition result based at least on the latent representation (see paras. [0629] (computing a reconstruction loss corresponds to a decoder) and [0099]; also see para. [0134] (autoencoder)).
Regarding claim 18, Mustikovela further teaches wherein the processor is comprised in at least one of: a control system for an autonomous or semi-autonomous machine (see paras. [0056] and [0147]); a perception system for an autonomous or semi-autonomous machine (see paras. [0056], [0296] and [0317]); a system for performing one or more simulation operations (see para. [0329]); a system for performing one or more digital twin operations (see para. [0071] (real/fake classification)); a system for performing light transport simulation (see para. [0288] (flashing light)); a system for performing collaborative content creation for 3D assets (see paras. [0282] (3D rendering) and [0303] (forming 3D data)); a system for performing one or more deep learning operations (see paras. [0211] and [0261]); a system implemented using an edge device (see figures 21A and 21C (vehicle 2100 comprising controllers 2136 and SoC 2104B and 2104A; these correspond to an edge device)); a system for generating or presenting at least one of virtual reality content, augmented reality content, or mixed reality content (see para. [0495]); a system implemented using a robot (see para. [0227]); a system for performing one or more conversational AI operations (see paras. [0330] and [0644]); a system for performing one or more generative AI operations (see paras. [0330] and [0644]); a system implementing one or more large language models (LLMs) (see para. [0563] (real-time language translation)); a system implementing one or more vision language models (VLMs) (see para. [0231]); a system for generating synthetic data (see para. [0191]); a system incorporating one or more virtual machines (VMs) (see para. [0216] (virtual machines)); a system implemented at least partially in a data center (see para. [0215]); or a system implemented at least partially using cloud computing resources (see paras. [0219] (Google cloud and Microsoft Azure) and [0239]).
Regarding claim 19, it is noted that the claim recites limitations similar to those of counterpart claim 11, and thus the statements applied to claim 11 above are incorporated herein. Mustikovela further teaches one or more memory units storing instructions (see para. [0500]).
Regarding claim 20, it is noted that the claim recites limitations similar to those of counterpart claim 18, and thus the statements applied to claim 18 above are incorporated herein.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA  to pre-AIA ) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim 6 is rejected under 35 U.S.C. 103 as being unpatentable over Mustikovela in view of Zhu (U.S. Pat. App. Pub. No. 2022/0414821 A1, referred to as Zhu hereinafter).
The statements applied to claims 1-4 and 7-20 with regard to Mustikovela above are incorporated herein.
Regarding claim 6, Mustikovela does not teach the claim limitation “wherein the conditional probability distribution comprises a mixture of Gaussian distributions”.
However, such a claim limitation is well known in the art, as evidenced by Zhu.
Zhu, in the same field of endeavor of image processing, teaches an equivariant neural network (see abstract, para. [0043] (equivariant encoder)) employing a rotation operation (see para. [0043]) and a mixture of Gaussian distributions (see paras. [0059] and [0072]).
The motivation for doing so is to improve matching, as suggested by Zhu in paragraphs [0059] and [0072].
Therefore, before the effective filing date of the instant claimed invention, it would have been obvious to one of ordinary skill in the art to incorporate such a claim limitation as taught by Zhu in combination with Mustikovela for that reason.
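To make the disputed limitation concrete, here is a minimal sketch of a conditional probability distribution over parts that takes the form of a mixture of Gaussians. It assumes isotropic components with a shared, hypothetical variance `sigma`, and is not drawn from the application, Mustikovela, or Zhu.

```python
import math

def mixture_posteriors(point, centers, weights, sigma=1.0):
    """p(part k | point) under an isotropic Gaussian mixture with shared sigma.

    The Gaussian normalizing constant is identical for every component, so it
    cancels; the maximum log term is subtracted for numerical stability.
    """
    logs = [
        math.log(w) - math.dist(point, c) ** 2 / (2.0 * sigma ** 2)
        for c, w in zip(centers, weights)
    ]
    top = max(logs)
    unnorm = [math.exp(v - top) for v in logs]
    total = sum(unnorm)
    return [u / total for u in unnorm]

# Two hypothetical part centers with equal mixing weights.
centers = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
weights = [0.5, 0.5]

probs_near_first = mixture_posteriors((0.5, 0.0, 0.0), centers, weights)
probs_midpoint = mixture_posteriors((1.5, 0.0, 0.0), centers, weights)
print(probs_near_first, probs_midpoint)
```

A point close to the first center receives most of its probability mass from the first component, while a point equidistant from both centers is split evenly.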
Allowable Subject Matter
Claim 5 is objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
The following is a statement of reasons for the indication of allowable subject matter:
Regarding claim 5: the method of claim 4, wherein the determining the second prediction comprises determining a plurality of centers associated with the plurality of piecewise equivariant regions based at least on an energy function that comprises a log-likelihood of the conditional probability distribution.
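The office action quotes the claim language only and gives no formula. Purely as an illustration, one form such an energy function could take is a negative log-likelihood over points $x_1,\dots,x_N$ and candidate centers $\mu_1,\dots,\mu_K$. The Gaussian mixture form, the weights $\pi_k$, and the covariances $\Sigma_k$ below are all assumptions and are not taken from the application.

```latex
E(\mu_1,\dots,\mu_K)
  = -\sum_{i=1}^{N} \log \sum_{k=1}^{K} \pi_k \,
    \mathcal{N}\!\left(x_i \mid \mu_k, \Sigma_k\right),
\qquad
(\mu_1^\ast,\dots,\mu_K^\ast) = \arg\min_{\mu_1,\dots,\mu_K} E(\mu_1,\dots,\mu_K)
```

Under this assumed form, lower energy corresponds to centers that explain the observed points better.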
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Jain et al. (U.S. Pat. App. Pub. No. 2022/0076032 A1) discloses a detection system for detecting lanes, freespace, and objects (figure 1A) employing machine learning models and neural networks (para. [0022]).
Any inquiry concerning this communication or earlier communications from the examiner should be directed to DUY M DANG whose telephone number is (571)272-7389. The examiner can normally be reached Monday to Friday from 7:00AM to 3:00PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Amandeep Saini can be reached at 571-272-3382. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.




DMD
4/2026
/DUY M DANG/Primary Examiner, Art Unit 2662

Prosecution Timeline

May 24, 2024

Application Filed

May 07, 2026

Non-Final Rejection mailed — §102, §103, §112 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

18/378,485

Patent 12639782

Scalable Image Storage Format

2y 7m to grant Granted May 26, 2026

18/647,447

Patent 12639834

TRAINING METHOD OF A NEURAL NETWORK MODEL FOR CT IMAGE REGISTRATION

2y 1m to grant Granted May 26, 2026

19/301,826

Patent 12641230

IMAGE ENCODING METHOD/DEVICE, IMAGE DECODING METHOD/DEVICE AND RECORDING MEDIUM HAVING BITSTREAM STORED THEREON

9m to grant Granted May 26, 2026

17/910,772

Patent 12628933

RECOMMENDING A FOUNDATION PRODUCT FROM AN IMAGE

3y 8m to grant Granted May 19, 2026

18/241,785

Patent 12633081

IMAGE PROCESSING SYSTEM, IMAGE PROCESSING METHOD, AND NON-TRANSITORY COMPUTER-READABLE STORAGE MEDIUM

2y 8m to grant Granted May 19, 2026

Study what changed to get past this examiner. Based on 5 most recent grants.

Prosecution Projections

1-2

Expected OA Rounds

91%

Grant Probability

97%

With Interview (+6.0%)

2y 7m (~7m remaining)

Median Time to Grant

Low

PTA Risk

Based on 860 resolved cases by this examiner. Grant probability derived from career allowance rate.
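The projections above can be reproduced from the counts reported earlier on this page. The sketch below assumes the interview lift is added in percentage points, which matches the displayed 91% and 97% but is not stated explicitly.

```python
# Counts from the Career Allowance Rate panel.
granted, resolved = 786, 860
interview_lift = 0.06  # assumption: +6.0% is additive, in percentage points

allowance_rate = granted / resolved
with_interview = allowance_rate + interview_lift

print(f"Grant probability: {allowance_rate:.0%}")
print(f"With interview:    {with_interview:.0%}")
```

The unrounded allowance rate is about 91.4%, so the displayed figures are rounded to the nearest whole percent.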