Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Status of the Claims
Claim 1 is the only claim pending in the application.
Claim 1 is amended.
Claim 1 is rejected.
The following is a Final Office Action in response to amendments and remarks filed Nov. 12, 2025.
Response to Arguments
Regarding the rejection under 35 U.S.C. 103, the previous rejection is withdrawn because the cited references do not teach the newly amended limitations. Please see below for the new rejection of the claim as amended.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claim 1 is rejected under 35 U.S.C. 103 as being unpatentable over Watson et al., US Pub. No. 2020/0012891, herein referred to as "Watson", in view of Rossetti et al., "Basics of Entity Resolution with Python and Dedupe," District Data Labs, Jan. 3, 2018, herein referred to as "Rossetti"; further in view of Kar et al., US Pub. No. 2021/0287050, herein referred to as "Kar"; further in view of Zhou, Xiang, "Interpretability Methods in Machine Learning: A Brief Survey," Two Sigma, Jun. 2, 2019, herein referred to as "Zhou".
Regarding claim 1, Watson teaches:
receiving a full version of an unperturbed plurality of data records with each data record including a plurality of feature values respectively corresponding to a plurality of features (receives original data set including various customer information, ¶ [0025] and Fig. 1, ref. char. 105);
and creating a perturbed version of the reduced version of the plurality of data records by modifying at least some of the feature values of the perturbed version of the plurality of data records (randomly generates fictional information similar to the information contained within original data and generates fictitious biographical information based on real information, ¶ [0027]);
determining, using a machine learning model, outcomes for each of the plurality of data records and the perturbed version of the plurality of data records (trains first model with the original dataset and trains second model with the synthetic dataset, e.g. ¶¶[0007], [0032]; see also ¶[0058] discussing comparing results of training the first and second models);
and based on the identification of the arbitrary decision indicator, canceling an outcome for an individual record from the unperturbed plurality of data records, and using the individual record as input for further training of the machine learning model (if summed error is greater than threshold, generates new synthetic data, ¶¶[0032], [0060] and repeats process, Fig. 1; see also Figs. 8-10 showing process for evaluating synthetic data).
However, Watson does not teach but Rossetti does teach:
for each given feature of the plurality of features associated with a given data record from the unperturbed plurality of data records, determining a correlation value between the given feature and each other feature of the plurality of features based on the feature values of the unperturbed plurality of data records (dedupe removes duplicate entries, pg. 2 of PDF provided with previous Office Action, by determining distance between entries, pgs. 3-5 of PDF provided with previous Office Action);
determining that a first feature and a second feature associated with the plurality of features are a duplicative pair of features based on the correlation value between the first feature and the second feature exceeding a threshold (dedupe removes duplicate entries, pg. 2 of PDF provided with previous Office Action, based on threshold, pgs. 3-5 of PDF provided with previous Office Action);
responsive to the determination that the correlation value exceeds the threshold, creating a reduced version of the unperturbed plurality of data records by removing the first feature or the second feature from the duplicative pair of features such that only one feature remains from the duplicative pair of features (dedupe removes duplicate entries, pg. 2 of PDF provided with previous Office Action).
Further, it would have been obvious, before the effective filing date of the claimed invention, to combine the customer behavior modelling of Watson with the entity resolution taught by Rossetti, because known work in one field of endeavor may prompt variations of it for use in the same field based on design incentives. See MPEP 2143.I.F. That is, one of ordinary skill would have recognized that large data sets (e.g., the customer data in Watson) would likely have multiple, redundant entries and accordingly would have modified Watson to use entity resolution to remove these duplicate entries.
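For illustration only, the correlation-based feature reduction recited in the limitations above can be sketched as follows. This is a minimal sketch and is not drawn from Watson or Rossetti: the function names `pearson` and `reduce_features`, the sample records, the 0.95 threshold, and the use of the absolute Pearson coefficient as the "correlation value" are all assumptions made for the example.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: a constant feature is treated as uncorrelated
    return cov / sqrt(var_x * var_y)


def reduce_features(records, threshold=0.95):
    """Remove the later feature of each pair whose |correlation| exceeds threshold.

    Returns the reduced records and the list of features that were kept.
    """
    features = list(records[0].keys())
    removed = set()
    for i, first in enumerate(features):
        if first in removed:
            continue
        for second in features[i + 1:]:
            if second in removed:
                continue
            r = pearson([rec[first] for rec in records],
                        [rec[second] for rec in records])
            if abs(r) > threshold:
                removed.add(second)  # only one feature of the pair remains
    kept = [f for f in features if f not in removed]
    return [{f: rec[f] for f in kept} for rec in records], kept


# Hypothetical data: income_kusd duplicates income_usd in different units.
records = [
    {"income_usd": 40000.0, "income_kusd": 40.0, "age": 25.0},
    {"income_usd": 52000.0, "income_kusd": 52.0, "age": 47.0},
    {"income_usd": 61000.0, "income_kusd": 61.0, "age": 31.0},
    {"income_usd": 75000.0, "income_kusd": 75.0, "age": 58.0},
]
reduced, kept = reduce_features(records)
print(kept)
```

In this sketch the two income columns are perfectly correlated, so the second is dropped, while `age` is retained because its correlation with income falls below the assumed threshold.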
However, the combination of Watson and Rossetti does not teach but Kar does teach:
based on the determined outcomes, identifying whether the machine learning model is biased based on identification of an arbitrary decision indicator (tests ML model to determine if it produces an inaccurate output, e.g., ¶¶[0017], [0053]; see also e.g., ¶¶[0028], [0029] discussing modifying data samples for testing),
wherein identifying the arbitrary decision indicator further comprises identifying an outcome of the at least one individual record and identifying, for the perturbed version of the at least one individual record having a different outcome, at least one feature that was modified and a value of the at least one feature (obtains successful and unsuccessful test results for each modified data sample provided to the ML model to test, ¶[0030]),
and determining that the machine learning model is biased based on the at least one feature bias in response to the machine learning model generating the different outcome for the at least one individual record based on modification of the value associated with the at least one feature (tests ML model to determine if it produces an inaccurate output, e.g., ¶¶[0017], [0053]);
and based on the identification of the arbitrary decision indicator, canceling the outcome for the at least one individual record from the unperturbed plurality of data records, and using the at least one individual record as input for further training of the machine learning model (if the test is a failure, a suggestion is generated for further training of the ML model under test, ¶¶[0017], [0021]).
Further, it would have been obvious, before the effective filing date of the claimed invention, to combine the customer behavior modelling and entity resolution of Watson and Rossetti with the testing of machine learning models based on modified data, as taught by Kar, because known work in one field of endeavor may prompt variations of it for use in the same field based on design incentives. See MPEP 2143.I.F. That is, one of ordinary skill would have recognized that the synthetic data taught by Watson would not only be useful for protecting individuals’ privacy but also would be useful for generating synthetic training and testing data for machine learning models (e.g., in situations where there is insufficient data for properly training or testing the model) and accordingly would have modified Watson and Rossetti to use the synthetic data to test machine learning models (i.e., to generate synthetic testing data), as taught by Kar.
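For illustration only, the perturbation-based identification of an arbitrary decision indicator recited in the limitations above can be sketched as follows. This is a minimal sketch and is not drawn from Watson, Rossetti, or Kar: `toy_model`, `find_arbitrary_decisions`, the sample record, and the perturbation values are hypothetical, and the toy model is deliberately written with an arbitrary dependence on one feature so that the check has something to find.

```python
def toy_model(record):
    """Hypothetical stand-in for the machine learning model under test.

    Approves on income, but denies arbitrarily for one zip code.
    """
    if record["zip_code"] == "00000":
        return "deny"
    return "approve" if record["income"] >= 50000 else "deny"


def find_arbitrary_decisions(model, record, perturbations):
    """Compare the outcome for a record with outcomes for perturbed copies.

    Returns the baseline outcome and a list of (feature, new_value,
    new_outcome) tuples, one per single-feature change that altered
    the outcome.
    """
    baseline = model(record)
    indicators = []
    for feature, new_value in perturbations:
        perturbed = dict(record)       # copy so the original stays unperturbed
        perturbed[feature] = new_value
        outcome = model(perturbed)
        if outcome != baseline:
            indicators.append((feature, new_value, outcome))
    return baseline, indicators


record = {"income": 60000, "zip_code": "12345"}
perturbations = [("zip_code", "00000"), ("zip_code", "54321"), ("income", 61000)]
baseline, indicators = find_arbitrary_decisions(toy_model, record, perturbations)

# If any indicator was found, cancel the outcome and queue the record
# as input for further training (here, just a list).
retraining_queue = []
final_outcome = baseline
if indicators:
    final_outcome = None
    retraining_queue.append(record)

print(baseline, indicators, final_outcome)
```

Each indicator names the modified feature and its value, which corresponds to identifying "at least one feature that was modified and a value of the at least one feature" for a perturbed record having a different outcome.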
However, the combination of Watson, Rossetti and Kar does not teach but Zhou does teach:
identifying whether the machine learning model is biased with respect to a specific feature of at least one individual record based on identification of an arbitrary decision indicator (determines specific feature impact on model error, pg. 5 of PDF provided with Office Action).
Further, it would have been obvious, before the effective filing date of the claimed invention, to combine the customer behavior modelling, entity resolution, and model testing of Watson, Rossetti and Kar with the interpretability methods of Zhou, because Zhou explicitly teaches that interpretability is crucial for guarding against bias and for debugging, pgs. 1-2 of PDF provided with Office Action; see also MPEP 2143.I.G.
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to BRENDAN S O'SHEA whose telephone number is (571)270-1064. The examiner can normally be reached Monday to Friday 10-6.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Nathan Uber, can be reached at (571) 270-3923. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/BRENDAN S O'SHEA/
Examiner, Art Unit 3626