DETAILED ACTION
This Office Action is sent in response to Applicant’s Communication received 12/30/2025 for application number 17/903,796.
Claims 1-9, 11-23 are pending.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-9, 11-23 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. Independent claims 1, 11, and 20 recite:
obtaining a plurality of sample data points; computing, by a data generator, a plurality of synthetic data points based the plurality of sample data points; determining one or more characteristics corresponding to the sample data points; selecting a first operation to partition the plurality of sample data points into machine learning training data and machine learning validation data by allocating each of the plurality of sample data points into one of the machine learning training data or the machine learning validation data; selecting a second operation to partition the plurality of synthetic data points into the machine learning training data and the machine learning validation data by allocating each of the plurality of sample data points into one of the machine learning training data or the machine learning validation data; wherein at least one of selecting the first operation and selecting the second operation comprises: determining a number of data points, of the plurality of sample data points and the plurality of synthetic data points, allocated into one of the machine learning training data or the machine learning validation data is based on the one or more characteristics corresponding to the plurality of sample data points; partitioning, using the first operation, the plurality of sample data points into the machine learning training data and the machine learning validation data; partitioning, using the second operation, the plurality of synthetic data points into the machine learning training data and the machine learning validation data; training a machine learning model using the machine learning training data; and validating the machine learning model using the machine learning validation data.
(2A, prong 1) The underlined portions of the claim recite an abstract idea, specifically a mental process. A human, with aid of pen and paper, can obtain sample data (for example, writing down a time and measurement like temperature), mentally create synthetic data based on the sample data (such as interpolating sample points between the measured time and temperature), and then divide the sample data and synthetic data into training and validation data sets based on the characteristics of the sample data (a human can mentally allocate the measured and interpolated data points into training and validation data based on the measured data; for example using more interpolated data points when there are fewer measured data points in a period of time, so the training and validation data is balanced).
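For illustration only, the pen-and-paper example above can be sketched in code. The sketch assumes timestamped temperature readings and linear interpolation at the midpoint between consecutive measurements; the helper name `interpolate_midpoints` and the sample values are hypothetical and do not come from the claims or the specification.

```python
def interpolate_midpoints(samples):
    """Return one synthetic (time, value) point midway between each pair of
    consecutive measured points, using linear interpolation."""
    ordered = sorted(samples)
    synthetic = []
    for (t0, v0), (t1, v1) in zip(ordered, ordered[1:]):
        # The midpoint in time gets the average of the neighboring values.
        synthetic.append(((t0 + t1) / 2, (v0 + v1) / 2))
    return synthetic


# Hypothetical measured data: (hour, temperature) pairs written down by hand.
measured = [(0.0, 10.0), (2.0, 14.0), (4.0, 12.0)]
synthetic = interpolate_midpoints(measured)
```

Each arithmetic step here is one a person could carry out with pen and paper, which is the point the mental-process analysis relies on.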
(2A, prong 2) This judicial exception is not integrated into a practical application. The additional elements (1) of generic computer elements including a non-transitory medium in claim 1, computer in claim 11, and processor in claim 20 are mere instructions to apply the exception because these limitations merely add generic computer components after the fact to the abstract idea. Additional elements (2) training an ML model with the training data set and (3) validating the ML model with the validation data set are also mere instructions to apply the exception. These elements merely recite the solution that the ML model is trained and validated but do not specify how the training and validation are accomplished. Even when additional elements (1)-(3) are taken together in the claim as a whole, they do not integrate the abstract idea into a practical application because they only add instructions to apply the exception to the abstract idea.
(2B) The claim(s) does/do not include additional elements that are sufficient to amount to significantly more than the judicial exception. Additional elements (1)-(3) are mere instructions to apply the exception, as explained above. Even when additional elements (1)-(3) are taken together in the claim as a whole, they do not amount to significantly more than the abstract idea itself because they are instructions to apply the exception to the abstract idea by adding generic computer components after the fact to the abstract idea and reciting only the idea of an outcome to the abstract idea.
Dependent claims 2-9, 12-19, and 21-23 add additional mental steps to the abstract idea. A human can mentally judge a quantity or quality of sample data for partitioning (for claims 2-3 and 12-13), mentally judge the characteristics of the sample data when partitioning the sample and synthetic data (claims 4-5 and 14-15), mentally partition some or all of the sample and/or synthetic data into training and validation data (claims 7-9 and 17-19), and partition data based on characteristics comprising a quantity of data, quantity of data in a class, or variation in data in a class (claims 21-23). The partitioning defined as hyperparameters used for training (claims 6 and 16) is a mental step because the specification states that, “Prior to training, a user selects hyperparameters that control the training process.” Spec. [0023] as filed.
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.
Claim(s) 1, 3, 5-6, 8-9, 11, 13, 15-16, 18-20, and 22-23 is/are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Al Faruque et al., (US 2023/0230484 A1).
In reference to claim 1, Al Faruque discloses one or more non-transitory computer-readable media (memory, para. 0066) storing instructions that, when executed by one or more hardware processors, cause performance of operations comprising: obtaining a plurality of sample data points (real-life images, para. 0070); computing, by a data generator, a plurality of synthetic data points based the plurality of sample data points (simulated driving data is generated to cover risky “corner cases” or other data that is not sufficiently covered in the real-world data, para. 0008, 0112-16, para. 0056 and fig. 17); determining one or more characteristics corresponding to the sample data points (risk score for points is determined, para. 0156-62); selecting a first operation to partition the plurality of sample data points into machine learning training data and machine learning validation data by allocating each of the plurality of sample data points into one of the machine learning training data or the machine learning validation data (real-world dataset, like “571-honda” is split randomly while maintaining the proportion of risky to safe lane change clips, para. 0162); selecting a second operation to partition the plurality of synthetic data points into the machine learning training data and the machine learning validation data by allocating each of the plurality of sample data points into one of the machine learning training data or the machine learning validation data (synthetic dataset, like “271-syn” is split randomly while maintaining the proportion of risky to safe lane change clips, para. 0162); wherein at least one of selecting the first operation and selecting the second operation comprises: determining a number of data points, of the plurality of sample data points and the plurality of synthetic data points, allocated into one of the machine learning training data or the machine learning validation data is based on the one or more characteristics corresponding to the plurality of sample data points (the splitting operations are based on the proportion of risky to safe lane changes in the data, para. 0162; maintaining the proportion of risky and safe lane changes is basing the allocation on a characteristic of the data, specifically the classes of sample data and number of sample data points); partitioning, using the first operation, the plurality of sample data points into the machine learning training data and the machine learning validation data; partitioning, using the second operation, the plurality of synthetic data points into the machine learning training data and the machine learning validation data (data is partitioned, para. 0162); training a machine learning model using the machine learning training data; and validating the machine learning model using the machine learning validation data (model is trained and validated, para. 0163-69).
In reference to claim 3, Al Faruque discloses the media of claim 1, wherein the one or more characteristics corresponding to the plurality of sample data points comprises variation among the plurality of sample data points (split is based on variation, i.e. the proportion of risky to safe lane changes in the data, para. 0162).
In reference to claim 5, Al Faruque discloses the media of claim 1, wherein the characteristics of sample data points determine the partitioning of the sample data points (split of real-world data based on variation of real-world data, i.e. the proportion of risky to safe lane changes in the data, para. 0162).
In reference to claim 6, Al Faruque discloses the media of claim 1, wherein the operations further comprise partitioning synthetic data points and sample data points defined as hyperparameters used for training a machine learning model (framework for splitting data is defined as part of training hyperparameters, para. 0126).
In reference to claim 8, Al Faruque discloses the media of claim 1, wherein the first operation comprises: allocating a first portion of the plurality of sample data points into the machine learning training data; and allocating a second portion of the plurality of sample data points into the machine learning validation data (real-world data split 7:3, para. 0162).
In reference to claim 9, Al Faruque discloses the media of claim 8, wherein the second operation comprises: allocating a first portion of the plurality of synthetic data points into the machine learning training data; and allocating a second portion of the plurality of synthetic data points into the machine learning validation data (simulated data split 7:3, para. 0162).
In reference to claim 11, this claim is directed to a method associated with the non-transitory computer-readable media claimed in claim 1 and is therefore rejected under a similar rationale.
In reference to claim 13, this claim is directed to a method associated with the non-transitory computer-readable media claimed in claim 3 and is therefore rejected under a similar rationale.
In reference to claim 15, this claim is directed to a method associated with the non-transitory computer-readable media claimed in claim 5 and is therefore rejected under a similar rationale.
In reference to claim 16, this claim is directed to a method associated with the non-transitory computer-readable media claimed in claim 6 and is therefore rejected under a similar rationale.
In reference to claim 18, this claim is directed to a method associated with the non-transitory computer-readable media claimed in claim 8 and is therefore rejected under a similar rationale.
In reference to claim 19, this claim is directed to a method associated with the non-transitory computer-readable media claimed in claim 9 and is therefore rejected under a similar rationale.
In reference to claim 20, this claim is directed to a system associated with the non-transitory computer-readable media claimed in claim 1 and is therefore rejected under a similar rationale.
In reference to claim 22, Al Faruque discloses the media of claim 1, wherein: the characteristics comprise a quantity of data points in one or more classes of the sample data points (split is based on maintaining a proportion of risky to safe lane changes in the training and validation sets, para. 0162).
In reference to claim 23, Al Faruque discloses the media of claim 1, wherein: the characteristics comprise a variation in one or more classes of data points included in the sample data points (split is based on maintaining a proportion of risky to safe lane changes in the training and validation sets, para. 0162; this is a variation in the classes of data because Al Faruque is trying to ensure both the training and validation sets have the same variation, or difference, in classes).
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claim(s) 2, 4, 12, 14, and 21 is/are rejected under 35 U.S.C. 103 as being unpatentable over Al Faruque et al., (US 2023/0230484 A1) in view of Joseph, Optimal ratio for data splitting (see attached NPL).
In reference to claim 2, Al Faruque does not explicitly teach the media of claim 1, wherein the one or more characteristics corresponding to the plurality of sample data points comprises a quantity of the plurality of sample data points.
Joseph teaches the media of claim 1, wherein the one or more characteristics corresponding to the plurality of sample data points comprises a quantity of the plurality of sample data points (see pages 535-36: the ratio for splitting a data set into training and validation data is calculated based in part on the number of rows N in the dataset).
It would have been obvious to one of ordinary skill in the art, having the teachings of Al Faruque and Joseph before the earliest effective filing date, to modify the split as disclosed by Al Faruque to include the number of sample data points as taught by Joseph.
One of ordinary skill in the art would have been motivated to modify the split of Al Faruque to include the number of sample data points of Joseph because it helps find a more optimal ratio (Joseph, pages 531-32).
In reference to claim 4, Al Faruque does not explicitly teach the media of claim 1, wherein the characteristics of sample data points determine the partitioning of the synthetic data points.
Joseph teaches the media of claim 1, wherein the characteristics of sample data points determine the partitioning of the synthetic data points (see pages 535-36: the ratio for splitting a data set into training and validation data is calculated based in part on the number of rows N in the dataset; thus, as applied to Al Faruque, the ratio for splitting the synthetic data points would be based on a total number of rows of training data, i.e. a characteristic of the sample data points).
It would have been obvious to one of ordinary skill in the art, having the teachings of Al Faruque and Joseph before the earliest effective filing date, to modify the split as disclosed by Al Faruque to include the number of sample data points as taught by Joseph.
One of ordinary skill in the art would have been motivated to modify the split of Al Faruque to include the number of sample data points of Joseph because it helps find a more optimal ratio (Joseph, pages 531-32).
In reference to claim 12, this claim is directed to a method associated with the non-transitory computer-readable media claimed in claim 2 and is therefore rejected under a similar rationale.
In reference to claim 14, this claim is directed to a method associated with the non-transitory computer-readable media claimed in claim 4 and is therefore rejected under a similar rationale.
In reference to claim 21, Al Faruque does not explicitly teach the media of claim 1, wherein: the characteristics comprise a quantity of the sample data points; and the first operation partitions the sample data points based on a comparison of the quantity of data points to a plurality of threshold values.
Joseph teaches the characteristics comprise a quantity of the sample data points; and the first operation partitions the sample data points based on a comparison of the quantity of data points to a plurality of threshold values (a quantity of the sample data points, like a number of features and a number of unique rows, is used to calculate ratios for splitting data; the ratios are threshold values).
It would have been obvious to one of ordinary skill in the art, having the teachings of Al Faruque and Joseph before the earliest effective filing date, to modify the split as disclosed by Al Faruque to include the number of sample data points as taught by Joseph.
One of ordinary skill in the art would have been motivated to modify the split of Al Faruque to include the number of sample data points of Joseph because it helps find a more optimal ratio (Joseph, pages 531-32).
Claim(s) 7 and 17 is/are rejected under 35 U.S.C. 103 as being unpatentable over Al Faruque et al., (US 2023/0230484 A1) in view of O’Toole et al. (US 2019/0219716 A1).
In reference to claim 7, Al Faruque does not explicitly teach the media of claim 1, wherein: the first operation comprises allocating all data points in the plurality of sample data points into the machine learning training data; and the second operation comprises allocating all data points in the plurality of synthetic data points into the machine learning validation data.
O’Toole teaches the media of claim 1, wherein: the first operation comprises allocating all data points in the plurality of sample data points into the machine learning training data; and the second operation comprises allocating all data points in the plurality of synthetic data points into the machine learning validation data (synthetically generated data can be used as the validation set, para. 0032, so real-world well-data would be used as the training data, para. 0031-32).
It would have been obvious to one of ordinary skill in the art, having the teachings of Al Faruque and O’Toole before the earliest effective filing date, to modify the allocation as disclosed by Al Faruque to use all sample data as training data and all synthetic data as validation data, as taught by O’Toole.
One of ordinary skill in the art would have been motivated to modify the allocation of Al Faruque to use all sample data as training data and all synthetic data as validation data, as in O’Toole, because it is the simple substitution of one known element for another with predictable results. Splitting a dataset by a fixed ratio into training data and validation data is known in the art (as taught by both Al Faruque, para. 0162, and O’Toole at para. 0031). O’Toole also teaches that as an alternative, one may use only synthetic data as the validation set and real data as the training data at para. 0032. Therefore, one having ordinary skill in the art could have substituted the one known element for the other, with the predictable result of using only synthetic data as the validation set and real data as the training data.
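The two allocation schemes being compared can be sketched side by side. This is only an illustration under stated assumptions: the function names `split_fixed_ratio` and `allocate_by_source`, the ten-item placeholder datasets, and the use of list order in place of random selection are all hypothetical and are not taken from Al Faruque or O’Toole.

```python
def split_fixed_ratio(points, train_parts=7, total_parts=10):
    """Allocate the first train_parts/total_parts of the points to training
    and the remainder to validation (integer arithmetic avoids rounding)."""
    cut = len(points) * train_parts // total_parts
    return list(points[:cut]), list(points[cut:])


def allocate_by_source(real_points, synthetic_points):
    """Alternative allocation: all real points train, all synthetic validate."""
    return list(real_points), list(synthetic_points)


# Hypothetical placeholder datasets.
real = [f"real-{i}" for i in range(10)]
synthetic = [f"syn-{i}" for i in range(10)]

# Scheme A: each source is split 7:3, then the parts are pooled.
real_train, real_val = split_fixed_ratio(real)
syn_train, syn_val = split_fixed_ratio(synthetic)
train_a, val_a = real_train + syn_train, real_val + syn_val

# Scheme B: the source alone decides the destination.
train_b, val_b = allocate_by_source(real, synthetic)
```

Under these assumptions, scheme A yields 14 training and 6 validation points drawn from both sources, while scheme B places every real point in training and every synthetic point in validation.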
In reference to claim 17, this claim is directed to a method associated with the non-transitory computer-readable media claimed in claim 7 and is therefore rejected under a similar rationale.
Response to Arguments
Applicant's arguments filed 12/10/2025 have been fully considered but they are not persuasive. With respect to the 101 rejection, Applicant argues that under step 2A, prong 1, “training a machine learning model” does not recite an abstract idea, and under step 2A, prong 2, the claim as a whole reflects an improvement to the functioning of a computer. The Examiner respectfully disagrees. First, for step 2A, prong 1, the 101 rejection above does not state the training limitation is an abstract idea: the training limitation is an additional limitation that is a mere instruction to apply the exception. Second, for step 2A, prong 2, the MPEP states that:
An important consideration in determining whether a claim improves technology is the extent to which the claim covers a particular solution to a problem or a particular way to achieve a desired outcome, as opposed to merely claiming the idea of a solution or outcome. McRO, 837 F.3d at 1314-15, 120 USPQ2d at 1102-03; DDR Holdings, 773 F.3d at 1259, 113 USPQ2d at 1107. In this respect, the improvement consideration overlaps with other considerations, specifically the particular machine consideration (see MPEP § 2106.05(b)), and the mere instructions to apply an exception consideration (see MPEP § 2106.05(f)). Thus, evaluation of those other considerations may assist examiners in making a determination of whether a claim satisfies the improvement consideration.
It is important to note, the judicial exception alone cannot provide the improvement. The improvement can be provided by one or more additional elements. See the discussion of Diamond v. Diehr, 450 U.S. 175, 187 and 191-92, 209 USPQ 1, 10 (1981)) in subsection II, below. In addition, the improvement can be provided by the additional element(s) in combination with the recited judicial exception. See MPEP § 2106.04(d) (discussing Finjan, Inc. v. Blue Coat Sys., Inc., 879 F.3d 1299, 1303-04, 125 USPQ2d 1282, 1285-87 (Fed. Cir. 2018)). Thus, it is important for examiners to analyze the claim as a whole when determining whether the claim provides an improvement to the functioning of computers or an improvement to other technology or technical field.
MPEP § 2106.05(a)
As the MPEP notes, the improvement must be provided by the additional elements in combination with the judicial exception and not the judicial exception alone, and the Examiner should consider the extent to which the claim covers a particular solution versus merely claiming an outcome in determining if there is a technical improvement. Here, the entirety of the technical improvement of better few-shot learning comes from the mental process: all the steps of synthesizing and splitting data can be performed mentally by a person. The additional limitations of training and validating a ML model are mere instructions to apply the mental process because they merely claim the idea of a solution (that a ML model is trained for few-shot learning) and not how the solution is accomplished (there are no details or restrictions whatsoever on how the training and validation work -- the limitations instead cover all solutions that use the mental process). Even when the mental process is considered together with the training and validation limitations in ordered combination, the claims encompass a mental process of creating synthetic training data from real data, making mental judgments about allocating the synthetic and real training data into training and validation datasets, and then any kind of training and validation of a ML model with the datasets, without limitations or details on the types of models, or how the training and validation are performed. Therefore, the additional limitations do not integrate the abstract idea into a practical application at step 2A, prong 2 by providing a technical improvement to the functioning of a computer.
For the 103 rejection, Applicant argues that (1) Al Faruque splits the data randomly and does not determine a number of data points to allocate to the training and validation sets based on a characteristic of the data, and (2) Al Faruque does not “select a first/second operation.” First, the specification states that a “characteristic” of the data can be the total number of data points, or the class of the data points including the number of data points in a class (see paragraphs 43-45 as filed). Then, the “operation” allocates how the data points are split into training and validation datasets (see paragraphs 46-48 as filed).
For (1), Al Faruque is determining a number of data points to allocate based on a characteristic of the data because in order to split the data while maintaining the ratio of risky:safe in training and validation datasets, it must first determine how many points of data are in each class (which is a characteristic of the data), calculate an overall risky:safe ratio, and from that ratio calculate how many risky and safe datapoints go into each of the training and validation datasets so that both datasets have the same ratio of risky:safe. For (2), Al Faruque is selecting first and second operations because it is determining the ratios and how to allocate different classes of data points based on the characteristics of the data, i.e. the number of items in each class. Once the quantity of data points required is determined, then the data is randomly partitioned into training and validation data. Applicant’s argument seems to be that Al Faruque just randomly divides up the data, which would not maintain an even distribution of risky:safe data.
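The sequence of steps described above (count each class, derive per-class quantities, then partition randomly within each class) can be sketched as a class-proportional split. This is a minimal sketch, not Al Faruque’s implementation: the function name `stratified_split`, the fixed seed, and the counts of 20 risky and 80 safe clips are assumptions chosen for illustration; only the 7:3 ratio and the risky/safe classes come from the discussion of para. 0162.

```python
import random
from collections import defaultdict


def stratified_split(labeled_points, train_parts=7, total_parts=10, seed=0):
    """Split (point, label) pairs so that every class is divided in the same
    train:validation ratio, with random selection inside each class."""
    # Step 1: group points by class to learn how many fall in each class.
    by_class = defaultdict(list)
    for point, label in labeled_points:
        by_class[label].append(point)

    rng = random.Random(seed)
    train, validation = [], []
    for label, points in by_class.items():
        # Step 2: the class size fixes how many points of this class train.
        n_train = len(points) * train_parts // total_parts
        # Step 3: only now is the choice of individual points randomized.
        shuffled = points[:]
        rng.shuffle(shuffled)
        train += [(p, label) for p in shuffled[:n_train]]
        validation += [(p, label) for p in shuffled[n_train:]]
    return train, validation


# Hypothetical dataset: 20 risky clips and 80 safe clips (a 1:4 ratio).
clips = [(i, "risky") for i in range(20)] + [(i, "safe") for i in range(20, 100)]
train, validation = stratified_split(clips)
```

With these assumed counts, training receives 14 risky and 56 safe clips and validation receives 6 risky and 24 safe clips, so both sets keep the 1:4 ratio. A purely random split of the pooled data would not guarantee that outcome.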
Conclusion
THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Andrew T. Chiusano whose telephone number is (571)272-5231. The examiner can normally be reached M-F, 10am-6pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Tamara Kyle, can be reached at 571-272-4241. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/ANDREW T CHIUSANO/Primary Examiner, Art Unit 2144