Prosecution Insights
Last updated: April 19, 2026
Application No. 18/800,325

SYSTEMS AND METHODS FOR MULTIMODAL GROUND TRUTH SAMPLING

Status: Non-Final OA (§101, §103)
Filed: Aug 12, 2024
Examiner: MCDOWELL, JR, MAURICE L
Art Unit: 2612
Tech Center: 2600 (Communications)
Assignee: Noblis Inc.
OA Round: 1 (Non-Final)
Grant Probability: 86% (Favorable)
Expected OA Rounds: 1-2
Time to Grant: 3y 0m
With Interview: 99%

Examiner Intelligence

This examiner grants 86% of cases, above average.

Career Allow Rate: 86% (790 granted / 913 resolved; +24.5% vs Tech Center average)
Interview Lift: +12.9% allowance lift (moderate), comparing resolved cases with and without an interview
Typical Timeline: 3y 0m average prosecution; 23 applications currently pending
Career History: 936 total applications across all art units
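As a sanity check on the headline numbers: the 86% figure is simply the granted-to-resolved ratio, and the 99% with-interview figure is consistent with adding the stated +12.9% lift to the base rate. A minimal sketch of that assumed arithmetic (the dashboard's actual formula is not disclosed here):

```python
granted, resolved = 790, 913
allow_rate = granted / resolved                 # 0.865... shown as 86%
interview_lift = 0.129                          # the stated +12.9% lift
with_interview = min(allow_rate + interview_lift, 1.0)  # 0.994... shown as 99%
print(f"{allow_rate:.1%} base, {with_interview:.0%} with interview")
```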

Statute-Specific Performance

§101: 16.1% (-23.9% vs TC avg)
§103: 47.7% (+7.7% vs TC avg)
§102: 12.8% (-27.2% vs TC avg)
§112: 7.7% (-32.3% vs TC avg)
Tech Center averages are estimates. Based on career data from 913 resolved cases.

Office Action

Non-Final Rejection under §101 and §103, mailed Mar 11, 2026.
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 101

35 U.S.C. 101 reads as follows: Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-11 are rejected under 35 U.S.C. 101 because the claimed invention is directed to non-statutory subject matter. The claims do not fall within at least one of the four categories of patent eligible subject matter because claim 1 is directed to a method of multimodal ground truth sampling for creating synthetic multimodal training data, with the steps of selecting, applying, generating, generating, and training, which are nothing more than software instructions. Software instructions are non-statutory under 35 U.S.C. 101. Claims 2-11 depend from claim 1 and contain additional steps (for example, claim 2 comprises the steps of intersecting, sampling, and replacing); therefore claims 2-11 have the same problem as claim 1 and are rejected under the same rationale.

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

The following is a quotation of 35 U.S.C. 103, which forms the basis for all obviousness rejections set forth in this Office action: A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 12 and 23 are rejected under 35 U.S.C. 103 as being unpatentable over GOYAL (US2023/0234233A1) in view of YAN (JP2020034559A), AGGARWAL (US2026/0030791A1), SHIM (US2025/0139945A1), and SCHWARTZ (US2021/0174131A1).

Regarding claim 1, GOYAL teaches: 1. A method of multimodal ground truth sampling for creating synthetic multimodal training data, the method performed by one or more processors, the method comprising (GOYAL: par. 88 lines 1-3 and 11-13; a POSITA would recognize that a system includes a method): selecting a source object from a dataset (GOYAL: par. 82 lines 23-29); determining a valid pose transformation from a set of proposed pose transformations (GOYAL: par. 82 lines 29-32); applying the valid pose transformation to the source object to create a transformed object (GOYAL: par. 82 lines 29-32).

GOYAL does not teach, but the analogous prior art YAN teaches: generating synthetic point cloud data based on a computer-aided design model and a destination point cloud (YAN: pg. 2, Background Art, lines 1-3; note: the computer-aided design model is interpreted as an object).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine generating synthetic point cloud data based on a computer-aided design model and a destination point cloud, as shown in YAN, with GOYAL, for the benefit of addressing a shortcoming in the prior art: one known method of acquiring the position and orientation of an obstacle in the environment is to roughly detect the obstacle from actual road driving data and then convert the distribution of these obstacles into a three-dimensional scene map based on positioning information. However, the position and orientation information obtained by that method, when compared with the position and orientation of the actual obstacle, can be offset by several meters, or even tens of meters, in each of the three spatial directions, and the pitch, roll, and yaw angles likewise exhibit varying degrees of error. The obstacle positioning information obtained by the conventional technique therefore has a larger error (YAN: pg. 2 lines 8-18).

The combination of GOYAL and YAN does not teach, but the analogous prior art AGGARWAL teaches: generating synthetic image data based on the transformed object and a destination image (AGGARWAL: par. 30; note: the scene is interpreted as the destination image); and the object taught by YAN above being a transformed object (AGGARWAL: par. 30).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine generating synthetic image data based on the transformed object and a destination image, and the object taught by YAN being a transformed object, as shown in AGGARWAL, with the previous combination, for the benefit of improving on conventional image generation models by generating synthetic images that depict a target object more accurately. For example, users can obtain synthetic images with an object that is similar to the identity of a target object (concept) from a reference image. Embodiments of the disclosure achieve this improved accuracy by training an object encoder that takes one or more transform parameters as input; the transform parameter indicates how much of a specified transformation to apply. For example, the transform input can include a level of transformation of a size parameter, an identity parameter, or both. Accordingly, the quality and accuracy of synthetic images are improved (AGGARWAL: par. 32).

The combination of GOYAL, YAN and AGGARWAL does not teach, but the analogous prior art SHIM teaches: training a machine learning model from synthetic multimodal training data comprising the synthetic image data and the synthetic point cloud data (SHIM: par. 72).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine training a machine learning model from synthetic multimodal training data comprising the synthetic image data and the synthetic point cloud data, as shown in SHIM, with the previous combination, for the benefit of addressing a shortcoming in the prior art: a lack of sufficient training data during the learning process of neural-network-based models may affect the convergence speed of learning and the performance of the models. When training data is insufficient, data augmentation technology may be used to augment it; for example, training data may be augmented by generating new data that combines different pieces of the training data with each other, or by transforming data using techniques such as rotation and color change (SHIM: par. 3).

The combination of GOYAL, YAN, AGGARWAL and SHIM does not teach, but the analogous prior art SCHWARTZ teaches: the machine learning model is a computer vision machine learning model (SCHWARTZ: par. 7 lines 19-21).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the machine learning model being a computer vision machine learning model, as shown in SCHWARTZ, with the previous combination, for the benefit of addressing a shortcoming in the prior art: the detect/recognize module in the computer vision module and the image model used in the training module operate on minimally processed image data. For example, a region of interest is identified in the camera images, and image portions are cropped and scaled from the RGB (or grayscale or other) camera image. No binarization of images or conversion of RGB or grayscale images to binary images is needed; no edge detection, connected component analysis, line finding or Hough transform, fiducial or template images, or conversion to polar images or intensity profiles are performed or used (SCHWARTZ: par. 70).

Claim 12 is analogous to claim 1 and is therefore rejected using the same rationale. Claim 12 further requires a different preamble and additional limitations, also taught by GOYAL: a system for multimodal ground truth sampling for creating synthetic multimodal training data, the system comprising (GOYAL: pars. 59 and 61) one or more processors (GOYAL: par. 61) and memory storing computer program code executable by the one or more processors to cause the system to perform the recited steps (GOYAL: par. 61).

Claim 23 is analogous to claim 1 and is therefore rejected using the same rationale. Claim 23 further requires a different preamble, also taught by GOYAL: a non-transitory computer readable storage medium storing one or more programs, the one or more programs comprising instructions which, when executed by a system, cause the system to perform the recited steps (GOYAL: par. 63 lines 23-27).
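Claim 1, as mapped above, recites a selecting / determining / applying / generating / training flow. Purely as a reading aid, here is a minimal Python sketch of that pipeline; every helper passed in below (propose_transformations, is_valid_pose, render_image, render_point_cloud, train) is a hypothetical stand-in, not code from the application or the cited references:

```python
import random

# Hedged sketch of the claim 1 pipeline: select a source object, keep a
# valid pose transformation, composite the transformed object into a
# destination scene, and train on the resulting multimodal sample.
def sample_multimodal_ground_truth(dataset, destination_image, destination_cloud,
                                   propose_transformations, is_valid_pose,
                                   render_image, render_point_cloud, train):
    source_object = random.choice(dataset)                       # "selecting"
    valid = next(t for t in propose_transformations(source_object)
                 if is_valid_pose(source_object, t))             # "determining"
    transformed = valid.apply(source_object)                     # "applying"
    synthetic_image = render_image(transformed, destination_image)
    synthetic_cloud = render_point_cloud(transformed, destination_cloud)
    return train({"image": synthetic_image,                      # "training"
                  "point_cloud": synthetic_cloud})
```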
Claims 4 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over GOYAL in view of YAN, AGGARWAL, SHIM and SCHWARTZ, further in view of LI et al., "SSP3D5000: a synthetic dataset for ship 3D perception," International Conference on Optics and Machine Vision (ICOMV 2024), Vol. 13179, pp. 135-141, SPIE, July 2024.

Regarding claim 4, the previous combination of GOYAL, YAN, AGGARWAL, SHIM and SCHWARTZ does not teach, but the analogous prior art LI teaches: 4. The method of claim 1, further comprising labeling the synthetic multimodal training data with one or more of an object class, a yaw, a length, a width, a height, an x-coordinate, a y-coordinate, or a z-coordinate (LI: pg. 2, section 1, lines 5-9; pg. 3, sub-section 2.2, lines 1-4).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine labeling the synthetic multimodal training data with one or more of an object class, a yaw, a length, a width, a height, an x-coordinate, a y-coordinate, or a z-coordinate, as shown in LI, with the previous combination, for the benefit of addressing a shortcoming in the prior art: currently, in the field of ship perception, datasets lack 3D information fusing images and point clouds, and real datasets face difficulties such as collecting data in extreme working conditions and the low accuracy of data labeling (LI: abstract, lines 1-2).

Claim 15 is analogous to claim 4 and is therefore rejected using the same rationale.
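Claim 4 (and system claim 15) attaches per-sample labels to the synthetic data. A minimal sketch of one plausible label record; the field names and example values are illustrative, not taken from the application:

```python
from dataclasses import dataclass

# Hypothetical label record covering the fields recited in claim 4:
# object class, yaw, length/width/height, and x/y/z coordinates.
@dataclass
class SampleLabel:
    object_class: str   # e.g. "vehicle", "pedestrian", "bicyclist"
    yaw: float          # heading angle, radians
    length: float       # bounding-box dimensions, meters
    width: float
    height: float
    x: float            # box-center coordinates in the scene frame
    y: float
    z: float

label = SampleLabel("vehicle", yaw=1.57, length=4.5, width=1.8,
                    height=1.5, x=12.0, y=-3.2, z=0.7)
```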
Claims 8, 10-11, 19 and 21-22 are rejected under 35 U.S.C. 103 as being unpatentable over GOYAL in view of YAN, AGGARWAL, SHIM and SCHWARTZ, further in view of KOHLI (US2012/0288186A1).

Regarding claim 8, the previous combination of GOYAL, YAN, AGGARWAL, SHIM and SCHWARTZ does not teach, but the analogous prior art KOHLI teaches: 8. The method of claim 1, wherein determining a valid pose transformation from a set of proposed pose transformations comprises determining that applying the proposed pose transformation to the source object to create a transformed object would not cause the transformed object to violate one or more occlusion criteria (KOHLI: par. 55 lines 1-6 and 17-20).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine determining that applying the proposed pose transformation to the source object to create a transformed object would not cause the transformed object to violate one or more occlusion criteria, as shown in KOHLI, with the previous combination, for the benefit of addressing a shortcoming in the prior art: gathering and annotating training images is expensive, time consuming, and requires human input. For example, images of certain object types may be gathered using textual queries to existing image search engines and filtered by human labelers who annotate the images. Such approaches are expensive or unreliable for object localization and segmentation because human interaction is required to provide accurate bounding boxes and segmentations of the object. Alternatively, algorithms requiring less training data may be used for object localization and segmentation; these algorithms identify particular invariant properties of an object to generalize all modes of variation of the object from existing training data. However, the accuracy of object recognition systems increases with the amount of training data. Accordingly, it is a challenge to develop large enough training sample sets to obtain satisfactory results (KOHLI: par. 2).

Regarding claim 10, the combination as modified by KOHLI (with the same motivation as for claim 8) further teaches: 10. The method of claim 8, wherein determining a valid pose transformation from a set of proposed pose transformations comprises determining that applying the proposed pose transformation to the source object to create a transformed object would not cause the transformed object to overlap with one or more objects in the destination image (KOHLI: par. 55 lines 1-6 and 17-20).

Regarding claim 11, the combination as modified by KOHLI (with the same motivation as for claim 8) further teaches: 11. The method of claim 1, wherein the source object is one of a vehicle, a pedestrian, or a bicyclist (KOHLI: par. 55 lines 9-17).

Claim 19 is analogous to claim 8, claim 21 to claim 10, and claim 22 to claim 11; each is rejected using the same rationale.
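Claims 8 and 10 (mirrored by 19 and 21) make pose validity turn on occlusion and overlap tests. One plausible shape for those checks, as a hedged sketch: visible_fraction and boxes_intersect are assumed helpers, and the 0.5 visibility floor is an invented example value, none of it from the application.

```python
# Hypothetical pose-validity check in the spirit of claims 8 and 10:
# reject a proposed transformation if the transformed object would be
# too occluded, or would overlap an object already in the scene.
def pose_is_valid(source_object, proposed_transform, scene_objects,
                  visible_fraction, boxes_intersect, min_visible=0.5):
    candidate = proposed_transform.apply(source_object)
    # Occlusion criterion (claim 8): enough of the object stays visible.
    if visible_fraction(candidate, scene_objects) < min_visible:
        return False
    # Overlap criterion (claim 10): no intersection with scene objects.
    if any(boxes_intersect(candidate.bbox, obj.bbox) for obj in scene_objects):
        return False
    return True
```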
Allowable Subject Matter

Claims 13-14, 16-18 and 20 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims. Claims 2-3, 5-7 and 9 would be objected to (except for the §101 rejection) as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

The following is a statement of reasons for the indication of allowable subject matter. Regarding claims 2-3, 5-7, 9, 13-14, 16-18 and 20, the prior art does not teach:

2. The method of claim 1, wherein generating synthetic image data based on the transformed object and the destination image comprises: intersecting simulated camera rays with locations on a transformed object mesh, wherein the transformed object mesh corresponds to the transformed object; sampling pixel values in a source image based on the intersected locations, wherein the source image corresponds to the source object; and replacing pixel values of the destination image with the sampled pixel values in the source image.

3. The method of claim 1, wherein generating synthetic point cloud data based on the transformed object and the destination point cloud comprises: removing points from the destination point cloud that are occluded by the transformed object; intersecting simulated LiDAR rays with locations on a transformed object mesh, wherein the transformed object mesh corresponds to the transformed object; sampling intensity values in a source object mesh based on the intersected locations, wherein the source object mesh corresponds to the source object; adding points to the destination point cloud at locations on the transformed object corresponding to the intersected locations; and assigning intensity values for the added points based on the sampled intensity values.

5. The method of claim 1, wherein prior to selecting a source object from a dataset, the method comprises: combining LiDAR points from one or more source point clouds into a combined point cloud; constructing a source object mesh from the combined point cloud; constructing a source object from the source object mesh and one or more source images; and saving the source object to the dataset.

6. The method of claim 5, wherein constructing the source object mesh from the combined point cloud comprises removing outlier points.

7. The method of claim 5, wherein constructing the source object mesh from the combined point cloud comprises removing points corresponding to a ground plane.

9. The method of claim 8, wherein determining a valid pose transformation from a set of proposed pose transformations comprises: measuring a first pixel length of a bounding box around the source object; measuring a second pixel length of a bounding box around the source object transformed using a proposed pose transformation; computing a ratio of the second pixel length to the first pixel length; and determining that the ratio does not exceed a distortion threshold.

System claims 13-14 recite, in system form, the limitations of method claims 2-3; claims 16-18 recite, in system form, the limitations of claims 5-7; and claim 20 recites, in system form, the limitations of claim 9.
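The allowable claims center on two mechanisms: ray-based compositing (claims 2-3 and 13-14; claim 3 applies the same pattern with simulated LiDAR rays and intensity values instead of camera rays and pixels) and a bounding-box distortion check (claims 9 and 20). A hedged sketch of both follows; cast_ray and sample_source are assumed helpers, and the threshold value is invented for illustration:

```python
import numpy as np

def composite_image(dest_image, transformed_mesh, cast_ray, sample_source):
    """Claims 2/13 sketch: replace destination pixels wherever a simulated
    camera ray intersects the transformed object mesh."""
    out = np.array(dest_image, copy=True)
    height, width = out.shape[:2]
    for v in range(height):
        for u in range(width):
            hit = cast_ray(u, v, transformed_mesh)  # intersect simulated camera ray
            if hit is not None:
                out[v, u] = sample_source(hit)      # sample source-image pixel value
    return out

def distortion_ok(first_px_len, second_px_len, threshold=2.0):
    """Claims 9/20 sketch: the ratio of the transformed bounding box's pixel
    length to the original's must not exceed a distortion threshold
    (threshold value assumed)."""
    return (second_px_len / first_px_len) <= threshold
```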
Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to MAURICE L. MCDOWELL, JR, whose telephone number is (571) 270-3707. The examiner can normally be reached Mon-Thu and Sat, 2pm-10pm. Examiner interviews are available via telephone, in person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Said A. Broome, can be reached at 571-272-2931. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (in USA or Canada) or 571-272-1000.

/MAURICE L. MCDOWELL, JR/
Primary Examiner, Art Unit 2612

Prosecution Timeline

Aug 12, 2024: Application Filed
Mar 11, 2026: Non-Final Rejection under §101 and §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12602875: TECHNIQUE FOR THREE DIMENSIONAL (3D) HUMAN MODEL PARSING. Granted Apr 14, 2026 (2y 5m to grant).
Patent 12602887: AUGMENTED REALITY CONTROL SURFACE. Granted Apr 14, 2026 (2y 5m to grant).
Patent 12598281: CONTROL APPARATUS, CONTROL METHOD, AND STORAGE MEDIUM FOR DETERMINING A CAMERA PATH INDICATING A MOVEMENT PATH OF A VIRTUAL VIEWPOINT IN A THREE-DIMENSIONAL SPACE. Granted Apr 07, 2026 (2y 5m to grant).
Patent 12579741: DETECTING THREE DIMENSIONAL (3D) CHANGES BASED ON MULTI-VIEWPOINT IMAGES. Granted Mar 17, 2026 (2y 5m to grant).
Patent 12561905: Optimizing Generative Machine-Learned Models for Subject-Driven Text-to-3D Generation. Granted Feb 24, 2026 (2y 5m to grant).
Based on the examiner's 5 most recent grants; studying what changed in each case can indicate how to get past this examiner.


Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: 86%
With Interview: 99% (+12.9%)
Median Time to Grant: 3y 0m
PTA Risk: Low
Based on 913 resolved cases by this examiner. Grant probability is derived from the career allow rate.
