Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Detailed Action
This action is in response to the claims filed 4/6/2023:
Claims 1 – 24 are pending.
Claims 1 and 22 are independent.
Drawings
The drawings are objected to because FIGs 3-6, 8-11, and 13-17 are low quality scans containing illegible elements. Corrected drawing sheets in compliance with 37 CFR 1.121(d) are required in reply to the Office action to avoid abandonment of the application. Any amended replacement drawing sheet should include all of the figures appearing on the immediate prior version of the sheet, even if only one figure is being amended. The figure or figure number of an amended drawing should not be labeled as “amended.” If a drawing figure is to be canceled, the appropriate figure must be removed from the replacement sheet, and where necessary, the remaining figures must be renumbered and appropriate changes made to the brief description of the several views of the drawings for consistency. Additional replacement sheets may be necessary to show the renumbering of the remaining figures. Each drawing sheet submitted after the filing date of an application must be labeled in the top margin as either “Replacement Sheet” or “New Sheet” pursuant to 37 CFR 1.121(d). If the changes are not accepted by the examiner, the applicant will be notified and informed of any required corrective action in the next Office action. The objection to the drawings will not be held in abeyance.
Specification
The disclosure is objected to because of the following informalities:
The title of the invention is not descriptive. A new title is required that is clearly indicative of the invention to which the claims are directed.
Claim Objections
Claim 20 is objected to because of the following informalities: "The method according to claim 19 acquiring further comprises" should read "The method according to claim 19 wherein the acquiring further comprises". Appropriate correction is required.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
Claims 1-24 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Regarding claim 1, "the AI classifier" lacks antecedent basis. Claim 1 introduces "one or more AI classifier" such that it's unclear which AI classifier "the AI classifier" refers to. "The one or more AI classifier" is recommended.
Claims 10 and 11 are indefinite. It's unclear how a numerical score (singular) can simultaneously have multiple probabilities as required by the claim. In the interest of further examination the claim is interpreted as "the aggregate negative/positive probability threshold is selected from one of the following:".
Regarding claim 17, "processing for analyzing results obtained from at least one of: AI classifiers results" appears circular and indefinite. It would be unclear to one of ordinary skill in the art what the scope of analyzing results from AI classifier results is. For example, are the analyzed results from further post-processing of the AI classifier results, or are the AI classifier results the results being analyzed? This is further complicated as the remaining elements in the list appear to be raw inputs with respect to the instant specification. In the interest of further examination, the AI classifier results are interpreted as the results being analyzed.
Regarding claim 21, "The method according to claim 1, examining further comprises at least one of: a user interface, and a system interface." is indefinite. Specifically it's unclear how the method of examining can comprise structural components (a user interface or system interface). In the interest of further Examination the claim is interpreted as "The method according to claim 1, the examining being performed on at least one of: a user interface, and a system interface.".
The remaining claims are rejected with respect to their dependence on the rejected claims.
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
Claims 1-5, 9, and 12-24 are rejected under 35 U.S.C. §102(a)(1) as being anticipated by Williams (US20150254555A1).
Regarding claim 1, Williams teaches A method for building and training at least one Artificial Intelligence (AI) classifier ([¶0067] "Network computer 300 includes one or more processor devices, such as, processor 302" [¶0075] "Applications 314 may also include, web server 316, machine learning engine 318, interactive tuning application 321, or the like." [¶0090] "Data Manager 506 which may be responsible for sorting and storing the data in the corpus the that data belongs to")
for detecting an indicium of at least one of: a disease, a condition, and a feature in a digital file, the method comprising:([¶0025] "data may be provided to a deep learning model that has been trained using a plurality of classifiers and one or more sets of training data and/or testing data." [¶0203] "the system may be arranged to analyze medical images" [¶0205] " the system may be used for classification purposes, for instance to determine if a patient has one or more disease conditions" [¶0206] "DLNN Model(s) 518 may be configured to learn and predict the pixel locations of anatomical structures across the entire input image")
assembling a positive data set and obtaining positive evaluation results by processing the positive data set by one or more AI classifier thereby training the one or more AI classifier for positive data;([¶0091] "Data Manager 506 may be arranged to separate the data into the Training Corpus database 508 and Testing Corpus database 510. [...] the data may be pre-classified in full or in part (either by a human or programmatically) for the training and performance evaluation phases of the system [...] A pre-labeled dataset may be one where known classifications are applied as labels. Returning to the e-discovery example, this may be a small subset of the full document population that has undergone human review and been categorized as responsive or non-responsive according to the discovery request" [¶0103] "One common method of evaluating the performance of a DLNN model or other machine learning model is to analyze confusion matrices and assess the quantity of True Positive, False Positive, True Negative, and False Negative classifications for each class" [¶0179] " Training and Testing Data is categorized as safe or malicious based on whether the data is known to be part of commonly used software source distributions that are considered safe, or have been evaluated as safe or malicious by a Domain Expert" "Responsive" or "safe" subset of the full document population interpreted as positive data set. The positive evaluation results are interpreted as the output values attached to said positive (responsive) examples and consumed by the training process during training. Positive evaluation results could also be interpreted as True Positive evaluation results.)
assembling a negative data set and obtaining negative evaluation results by processing the negative data set by the AI classifier thereby training the AI classifier for negative data;([¶0091] "Data Manager 506 may be arranged to separate the data into the Training Corpus database 508 and Testing Corpus database 510. [...] the data may be pre-classified in full or in part (either by a human or programmatically) for the training and performance evaluation phases of the system [...] A pre-labeled dataset may be one where known classifications are applied as labels. Returning to the e-discovery example, this may be a small subset of the full document population that has undergone human review and been categorized as responsive or non-responsive according to the discovery request" [¶0179] " Training and Testing Data is categorized as safe or malicious based on whether the data is known to be part of commonly used software source distributions that are considered safe, or have been evaluated as safe or malicious by a Domain Expert" "Non-responsive" or "malicious" subset of the full document population interpreted as negative data set. The negative evaluation results are interpreted as the output values attached to said negative (non-responsive) examples and consumed by the training process during training. Negative evaluation results could also be interpreted as True Negative evaluation results.)
analyzing a test data set by the one or more AI classifier to obtain test evaluation results ([¶0025] "data may be provided to a deep learning model that has been trained using a plurality of classifiers and one or more sets of [...] testing data." [¶0203] "the system may be arranged to analyze medical images" [¶0205] " the system may be used for classification purposes, for instance to determine if a patient has one or more disease conditions" [¶0206] "DLNN Model(s) 518 may be configured to learn and predict the pixel locations of anatomical structures across the entire input image")
and sorting the test evaluation results by at least one probability threshold to obtain at least one sorted results; and([¶0102] "Scoring Process 522 assigns a score to incoming data, ranking said data as a member of a class (or label), or as an anomalous data point. Runtime scoring delivers new data to the Scoring Process and makes those results available to the Domain Expert Analysis component 530. Testing scoring delivers testing data to Scoring Process 522 and delivers those results and the known classifications to the Model Performance Analysis component 524, which is used to calculate and evaluate performance metrics." [¶0175] "Combination Function 520 analyzes the scores predicted by both models in combination with the confidence and performance of each model, finally selecting the class representing the highest probability of accuracy." [¶0202] " if a claim score is below the threshold for automated adjudication, then Model(s) 518 output may be used as advisory information presented to human Domain Experts who are tasked with adjudicating the claim manually in Decision Process 536 using User Interface 532")
examining the sorted results to identify incorrectly sorted results and retraining by reanalyzing the one or more AI classifier for the incorrectly sorted results thereby building and training the AI classifier.([¶0167] " in the example document of FIG. 13, the file displayed has a path of “/user1/public_html/products/submit.php” and received a predicted maliciousness score of 0.85. In this example, a Domain Expert has reviewed this file and marked it as being malicious, agreeing with the system's prediction. If an expert disagrees with and reverses the predicted class of a data element, they may mark that element as the correct class and submit it to be part of the Training Corpus 508. This may be referred to as a “labeling conflict.” The submission of a document into the Training Corpus 508 after a labeling conflict may be performed automatically or with confirmation from a human user. However, submission of data into the Training Corpus 508 does not immediately increase the performance of the system. After a re-training of the model with the updated Training Corpus 508 the Domain Expert's refinement may be incorporated, but methods are available to reduce the amount of time required to re-train including using a combination of models that differing amounts of time and data sizes to train." Williams's thresholded, scored, and ranked outputs are explicitly examined by domain experts. When a domain expert identifies a mistake, they explicitly "reverse the predicted class" and resubmit the data element to the training corpus for retraining the model).
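For illustration only, and not as a characterization of how Williams or the instant application is actually implemented, the sort, examine, and resubmit sequence mapped above can be sketched as follows. All names (`ScoredFile`, `sort_by_threshold`, `find_incorrectly_sorted`, `resubmit_for_retraining`), the file names, and the 0.5 threshold are hypothetical; only the 0.85 score echoes the example quoted from ¶0167, and treating label 1 as the flagged class is an assumption of the sketch.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class ScoredFile:
    file_id: str
    score: float       # classifier's predicted probability of the flagged class
    expert_label: int  # label assigned on review: 1 = flagged class, 0 = not


def sort_by_threshold(
    results: List[ScoredFile], threshold: float
) -> Tuple[List[ScoredFile], List[ScoredFile]]:
    """Split scored results into predicted-positive and predicted-negative groups."""
    positives = [r for r in results if r.score >= threshold]
    negatives = [r for r in results if r.score < threshold]
    return positives, negatives


def find_incorrectly_sorted(
    results: List[ScoredFile], threshold: float
) -> List[ScoredFile]:
    """Return results whose sorted group disagrees with the reviewer's label."""
    positives, negatives = sort_by_threshold(results, threshold)
    wrong = [r for r in positives if r.expert_label == 0]
    wrong += [r for r in negatives if r.expert_label == 1]
    return wrong


def resubmit_for_retraining(
    corpus: Dict[str, int], wrong: List[ScoredFile]
) -> Dict[str, int]:
    """Add corrected examples to a copy of the training corpus (hypothetical store)."""
    updated = dict(corpus)
    for r in wrong:
        updated[r.file_id] = r.expert_label
    return updated


# Hypothetical scored results; the threshold of 0.5 is an assumption.
results = [
    ScoredFile("a.php", 0.85, 1),
    ScoredFile("b.php", 0.70, 0),
    ScoredFile("c.php", 0.20, 1),
    ScoredFile("d.php", 0.10, 0),
]
wrong = find_incorrectly_sorted(results, threshold=0.5)
training_corpus = resubmit_for_retraining({}, wrong)
```

In this sketch the retraining step itself is omitted; the corrected entries are only collected, consistent with the quoted statement that submission into the Training Corpus does not by itself improve performance until the model is retrained.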
Regarding claim 2, Williams teaches The method according to claim 1, the positive data set comprises a plurality of positive digital files.(Williams [¶0179] "the Training Corpus 508 and Testing Corpus 510 may be populated with data ingested 504 from webpage source code on a per-file basis […] Training and Testing Data is categorized as safe or malicious based on whether the data is known to be part of commonly used software source distributions that are considered safe, or have been evaluated as safe or malicious by a Domain Expert" Williams explicitly ingests files categorized as safe (positive) or malicious (negative)).
Regarding claim 3, Williams teaches The method according to claim 1, the negative data set comprises a plurality of negative digital files.(Williams [¶0179] "the Training Corpus 508 and Testing Corpus 510 may be populated with data ingested 504 from webpage source code on a per-file basis […] Training and Testing Data is categorized as safe or malicious based on whether the data is known to be part of commonly used software source distributions that are considered safe, or have been evaluated as safe or malicious by a Domain Expert" Williams explicitly ingests files categorized as safe (positive) or malicious (negative)).
Regarding claim 4, Williams teaches The method according to claim 2, the plurality of positive digital files further comprises presence of the indicium of at least one of: the disease, the condition, and the feature.(Williams [¶0179] "the Training Corpus 508 and Testing Corpus 510 may be populated with data ingested 504 from webpage source code on a per-file basis […] Training and Testing Data is categorized as safe or malicious based on whether the data is known to be part of commonly used software source distributions that are considered safe, or have been evaluated as safe or malicious by a Domain Expert" [¶0183] " If a malicious page is scored with a high degree of confidence, and a malicious code class label is also scored with a high-degree of confidence, Decision Process 536 may apply a set of adjustable confidence thresholds, and decide whether to automatically remediate the malicious code. Remediation actions may vary based on the malicious code class label." Williams is explicit about the data feature (malicious code) being present in a file determined to be safe or malicious).
Regarding claim 5, Williams teaches The method according to claim 3, the plurality of negative digital files further comprises absence of the indicium of at least one of: the disease, the condition, and the feature.(Williams [¶0179] "the Training Corpus 508 and Testing Corpus 510 may be populated with data ingested 504 from webpage source code on a per-file basis […] Training and Testing Data is categorized as safe or malicious based on whether the data is known to be part of commonly used software source distributions that are considered safe, or have been evaluated as safe or malicious by a Domain Expert" Files categorized as safe interpreted as files comprising absence of the indicium of the data feature (malicious code)).
Regarding claim 9, Williams teaches The method according to claim 1, the probability threshold is selected from: a negative probability threshold, a positive probability threshold, an aggregate positive probability threshold, and an aggregate negative probability threshold.(Williams [¶0102] "Scoring Process 522 assigns a score to incoming data, ranking said data as a member of a class (or label), or as an anomalous data point. Runtime scoring delivers new data to the Scoring Process and makes those results available to the Domain Expert Analysis component 530. Testing scoring delivers testing data to Scoring Process 522 and delivers those results and the known classifications to the Model Performance Analysis component 524, which is used to calculate and evaluate performance metrics." [¶0175] "Combination Function 520 analyzes the scores predicted by both models in combination with the confidence and performance of each model, finally selecting the class representing the highest probability of accuracy." [¶0202] " if a claim score is below the threshold for automated adjudication, then Model(s) 518 output may be used as advisory information presented to human Domain Experts who are tasked with adjudicating the claim manually in Decision Process 536 using User Interface 532").
Regarding claim 12, Williams teaches The method according to claim 1, the test data set further comprises a plurality of test digital files.(Williams [¶0091] "separate the data into the Training Corpus database 508 and Testing Corpus database 510" [¶0164] "The new data table 1200 is made up of rows 1202-1210, each representing a file that was processed through the system" [¶0167] "a Domain Expert has reviewed this file and marked it as being malicious, agreeing with the system's prediction. If an expert disagrees with and reverses the predicted class of a data element, they may mark that element as the correct class and submit it to be part of the Training Corpus 50").
Regarding claim 13, Williams teaches The method according to claim 12, the test digital files further comprise a plurality of positive test digital files and a plurality of negative test digital files.(Williams [¶0179] "the Training Corpus 508 and Testing Corpus 510 may be populated with data ingested 504 from webpage source code on a per-file basis [...] Training and Testing Data is categorized as safe or malicious based on whether the data is known to be part of commonly used software source distributions that are considered safe, or have been evaluated as safe or malicious by a Domain Expert").
Regarding claim 14, Williams teaches The method according to claim 13, the positive test digital files have the presence of indicium of at least one of: the disease, the condition, and the feature.(Williams [¶0179] "the Training Corpus 508 and Testing Corpus 510 may be populated with data ingested 504 from webpage source code on a per-file basis [...] Training and Testing Data is categorized as safe or malicious based on whether the data is known to be part of commonly used software source distributions that are considered safe, or have been evaluated as safe or malicious by a Domain Expert" [¶0016] "FIG. 12 is a table diagram showing the sample results of a real-time scoring of website files being analyzed for malicious code; and" [¶0017] "FIG. 13 is a display diagram depicting the assessment component of a file that has been scanned for malicious code in accordance with at least one of the various embodiments.").
Regarding claim 15, Williams teaches The method according to claim 13, the negative test digital files have the absence of indicium of at least one of: the disease, the condition, and the feature.(Williams [¶0179] "the Training Corpus 508 and Testing Corpus 510 may be populated with data ingested 504 from webpage source code on a per-file basis [...] Training and Testing Data is categorized as safe or malicious based on whether the data is known to be part of commonly used software source distributions that are considered safe, or have been evaluated as safe or malicious by a Domain Expert" [¶0016] "FIG. 12 is a table diagram showing the sample results of a real-time scoring of website files being analyzed for malicious code; and" [¶0017] "FIG. 13 is a display diagram depicting the assessment component of a file that has been scanned for malicious code in accordance with at least one of the various embodiments.").
Regarding claim 16, Williams teaches The method according to claim 1, the digital file is a format selected from at least one of: an image, a waveform, a genomic file, a metadata, a report, and a written template obtained from a subject.(Williams [¶0203] "the system may be arranged to analyze medical images [...] the Training Corpus 508 and Testing Corpus 510 may be populated with data ingested from imaging sources. Imaging source data may be from sources including, but not limited to, image formats common for the types of images being analyzed, for example, DICOM formats for MRI data, and converted to a pixel or voxel matrix.").
Regarding claim 17, Williams teaches The method according to claim 1 further comprising processing for analyzing results obtained from at least one of: AI classifiers results, medical images, non-medical images, medical report data, including words, phrases, sentences, medical laboratory data, medical waveforms such as electrocardiograph, electroencephalograph and electromyograph, radiologic images, genetic data.(Williams [¶0164] "the sample results of a real-time scoring of website files being analyzed for malicious code" [¶0203] "the system may be arranged to analyze medical images" [¶0205] "The Scoring Process 522 may produce a predicted set of diseases" [¶0213] "Classifiers trained to detect known patterns indicating unauthorized usage, such as insider threat behavior or known examples of malicious attacks." [¶0224] "Model(s) 518 may be trained with data from the Training Corpus 508 to classify the pictured objects using the general category, model type and brand labels. Model(s) 518 may be configured using a combination of convolutional layers and layers designed for classification").
Regarding claim 18, Williams teaches The method according to claim 17, the images are photographs.(Williams [¶0055] "Video interface 246 is arranged to capture video images, such as a still photo, a video segment, an infrared video, or the like" [¶0084] "Data may be any medium, including but not limited to, hardcopy documents that have been scanned, photographs, digital files and media, sensor data, log files, survey data, database records, program code, or the like." [¶0223] "The Training Corpus 508 and Testing Corpus 510 may be populated using digital photographs" [¶0231] "it is desirable for the system to recognize photographs of items that are inappropriate and automatically prohibit the user from submitting an incorrectly described or inappropriate item").
Regarding claim 19, Williams teaches The method according to claim 1 further comprising acquiring at least one of: the positive data set, the negative data set, and the test data set.(Williams [¶0088] "data may be ingested into the system and prepared for processing" [¶0091] "During the initialization of the system, Data Manager 506 may be arranged to separate the data into the Training Corpus database 508 and Testing Corpus database 510").
Regarding claim 20, Williams teaches The method according to claim 19 acquiring further comprises extracting at least one of: the positive data set, the negative data set, and the test data set from a database library.(Williams [¶0242] "Training Corpus 508 and Testing Corpus 510 are populated with image data extracted from exemplar documents and image files that are provided by an organization as examples of sensitive content. Data Ingestion 504 extracts images and diagrams that are embedded within documents and databases").
Regarding claim 21, Williams teaches The method according to claim 1, examining further comprises at least one of: a user interface, (Williams [¶0182] "Pages that are scored as malicious may be logged and made available for audit or Domain Expert examination using User Interface 532" [¶0183] "if anomalous pages are detected, those pages are also made available in User Interface 532")
and a system interface.(Williams [¶0201] "Decision 536 may be used to automatically adjudicate the claim, sending the claim decision to a claims payment and recording system using mechanisms including, but not limited to, database records or application programming interface calls" API interpreted as a system interface).
Regarding claim 22, Williams teaches A system programmed to train one or more Artificial Intelligence (AI) classifiers by the method of claim 1, the system comprising: at least one AI processor; and(Williams [¶0067] "Network computer 300 includes one or more processor devices, such as, processor 302" [¶0075] "Applications 314 may also include, web server 316, machine learning engine 318, interactive tuning application 321, or the like." [¶0090] "Data Manager 506 which may be responsible for sorting and storing the data in the corpus the that data belongs to")
a display device.(Williams [¶0502] "Display 240 may be a liquid crystal display (LCD), gas plasma, light emitting diode (LED), organic LED, or any other type of display used with a computer. Display 240 may also include a touch sensitive screen arranged to receive input from an object such as a stylus or a digit from a human hand.").
Regarding claim 23, Williams teaches The system according to claim 22 further comprising a user interface and/or a system interface.(Williams [¶0182] "Pages that are scored as malicious may be logged and made available for audit or Domain Expert examination using User Interface 532" [¶0183] "if anomalous pages are detected, those pages are also made available in User Interface 532" [¶0201] "Decision 536 may be used to automatically adjudicate the claim, sending the claim decision to a claims payment and recording system using mechanisms including, but not limited to, database records or application programming interface calls" API interpreted as a type of system interface).
Regarding claim 24, Williams teaches The system according to claim 22 further comprising at least one database library.(Williams [¶0242] "Training Corpus 508 and Testing Corpus 510 are populated with image data extracted from exemplar documents and image files that are provided by an organization as examples of sensitive content. Data Ingestion 504 extracts images and diagrams that are embedded within documents and databases").
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 6, 7, 10, and 11 are rejected under 35 U.S.C. §103 as being unpatentable over the combination of Williams and Herbei (“Classification with reject option”, 2005).
Regarding claim 6, Williams teaches The method according to claim 1 further comprising after retraining, performing iterations of the steps of sorting, examining, (Williams [¶0183] "During Domain Expert Analysis 530, suspected malicious webpages are reviewed by Domain Experts, and if appropriate, supplied a class label for the type of malicious code detected. From time to time, this data is included back into the Training Corpus 508 and revised versions of the Models 518 may be trained using the new data, based on Retraining Decision 119. If a malicious page is scored with a high degree of confidence, and a malicious code class label is also scored with a high-degree of confidence, Decision Process 536 may apply a set of adjustable confidence thresholds, and decide whether to automatically remediate the malicious code. Remediation actions may vary based on the malicious code class label." [¶0197] "Thresholds and rules may be tuned to maximize the efficiency gained by automation decision").
However, Williams does not explicitly teach and retraining the one or more AI classifier by a series of decreasing probability thresholds thereby obtaining a positive AI classifier or a group of positive AI classifiers.
Herbei, in the same field of endeavor, teaches and retraining the one or more AI classifier by a series of decreasing probability thresholds thereby obtaining a positive AI classifier or a group of positive AI classifiers.([p. 4] "we can restrict ourselves to the cases 0 ≤ d ≤ 1/2 and we denote the relevant risk function […] the Bayes rule (5) simplifies to […] 1 if η(x) > 1-d" [p. 21] "We split the data D into a training set D1 and D2, each of size 50 and consider, separately, two choices for d: d = .25 and d = .50. For each pair (k,h), we use the training data D1 to estimate η and the testing data D2 to estimate the risk" See also FIG. 1 where the series d, 1-d is interpreted as a series of decreasing probability (Bayes) thresholds to obtain a positive AI classifier η(x). See also FIG. 7 where Herbei varies d).
Williams and Herbei are both directed towards machine learning classification. Therefore, Williams and Herbei are reasonably pertinent analogous art. It would have been obvious before the effective filing date of the claimed invention to substitute Williams's generic threshold strategy with the mathematically precise thresholding strategy in Herbei. Herbei explicitly extends the threshold to medical classification, which provides additional motivation for the combination ([Abstract] “We extend the mathematical framework even further by differentiating between costs associated with the two possible errors: predicting f(X) = 0 whilst Y = 1 and predicting f(X) = 1 whilst Y = 0. Such situations are common in, for instance, medical studies where misclassifying a sick patient as healthy is worse than the opposite.”). This motivation for combination also applies to the remaining claims which depend on this combination.
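For illustration only, the thresholding behavior relied upon from Herbei can be sketched as a classifier with a reject option: predict 1 when the estimated posterior exceeds 1-d, predict 0 when it falls below d, and otherwise withhold a decision. The function name `reject_option_rule`, the use of `None` to denote rejection, and the sample values are assumptions of this sketch rather than material taken from Herbei; only the upper branch (1 if the posterior exceeds 1-d, with 0 ≤ d ≤ 1/2) is quoted above, and the lower and reject branches follow the standard form of such rules.

```python
from typing import List, Optional


def reject_option_rule(eta: float, d: float) -> Optional[int]:
    """Threshold an estimated posterior eta = P(Y = 1 | x) with reject cost d.

    Returns 1 or 0 for a confident prediction, or None to reject
    (None as the reject marker is an assumption of this sketch).
    """
    if not 0.0 <= d <= 0.5:
        raise ValueError("d is restricted to the range 0 <= d <= 1/2")
    if eta > 1.0 - d:
        return 1       # confident positive
    if eta < d:
        return 0       # confident negative
    return None        # posterior lies between d and 1 - d: no decision


# Sweeping d upward lowers the positive threshold 1 - d and raises the
# negative threshold d (hypothetical values chosen for illustration).
d_values = [0.1, 0.25, 0.4, 0.5]
decisions: List[Optional[int]] = [reject_option_rule(0.7, d) for d in d_values]
```

Under these assumptions, a posterior of 0.7 is rejected for the two smaller values of d and classified as positive once 1-d drops below 0.7, which is the sense in which varying d yields a series of decreasing positive thresholds and increasing negative thresholds.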
Regarding claim 7, Williams teaches The method according to claim 1 further comprising after retraining, performing iterations of the steps of sorting, examining, (Williams [¶0183] "During Domain Expert Analysis 530, suspected malicious webpages are reviewed by Domain Experts, and if appropriate, supplied a class label for the type of malicious code detected. From time to time, this data is included back into the Training Corpus 508 and revised versions of the Models 518 may be trained using the new data, based on Retraining Decision 119. If a malicious page is scored with a high degree of confidence, and a malicious code class label is also scored with a high-degree of confidence, Decision Process 536 may apply a set of adjustable confidence thresholds, and decide whether to automatically remediate the malicious code. Remediation actions may vary based on the malicious code class label." [¶0197] "Thresholds and rules may be tuned to maximize the efficiency gained by automation decision").
However, Williams does not explicitly teach and retraining the one or more AI classifier by a series of increasing probability thresholds thereby obtaining a negative AI classifier or a group of negative AI classifiers.
Herbei, in the same field of endeavor, teaches and retraining the one or more AI classifier by a series of increasing probability thresholds thereby obtaining a negative AI classifier or a group of negative AI classifiers ([p. 4] "we can restrict ourselves to the cases 0 ≤ d ≤ 1/2 and we denote the relevant risk function […] the Bayes rule (5) simplifies to […] 1 if η(x) > 1-d"; [p. 21] "We split the data D into a training set D1 and D2, each of size 50 and consider, separately, two choices for d: d = .25 and d = .50. For each pair (k,h), we use the training data D1 to estimate η and the testing data D2 to estimate the risk". See also FIG. 1, where the series d, 1-d is interpreted as a series of increasing probability (Bayes) thresholds to obtain a negative AI classifier η(x)).
Williams as well as Herbei are directed towards machine learning classification. Therefore, Williams as well as Herbei are reasonably pertinent analogous art. It would have been obvious before the effective filing date of the claimed invention to substitute Williams' generic threshold strategy with the mathematically precise thresholding strategy in Herbei. Herbei explicitly extends the threshold to medical classification and provides as additional motivation for combination ([Abstract] “We extend the mathematical framework even further by differentiating between costs associated with the two possible errors: predicting f(X) = 0 whilst Y = 1 and predicting f(X) = 1 whilst Y = 0. Such situations are common in, for instance, medical studies where misclassifying a sick patient as healthy is worse than the opposite.”). This motivation for combination also applies to the remaining claims which depend on this combination.
Regarding claim 10, Williams teaches The method according to claim 9.
However, Williams doesn't explicitly teach the aggregate positive probability threshold is selected for the numeric score having: 99% probability, 95% probability, 90% probability, 85% probability, 80% probability, 75% probability, 70% probability, 65% probability, 60% probability, 55% probability, and 50% probability.
Herbei, in the same field of endeavor, teaches The method according to claim 9, the aggregate positive probability threshold is selected for the numeric score having: 99% probability, 95% probability, 90% probability, 85% probability, 80% probability, 75% probability, 70% probability, 65% probability, 60% probability, 55% probability, and 50% probability ([p. 4] "we can restrict ourselves to the cases 0 ≤ d ≤ 1/2 and we denote the relevant risk function […] the Bayes rule (5) simplifies to […] 1 if η(x) > 1-d". Herbei explicitly restricts d to less than or equal to 50%, so 1-d must therefore be greater than or equal to 50%).
Williams as well as Herbei are directed towards machine learning classification. Therefore, Williams as well as Herbei are reasonably pertinent analogous art. It would have been obvious before the effective filing date of the claimed invention to substitute Williams' generic threshold strategy with the mathematically precise thresholding strategy in Herbei. Herbei explicitly extends the threshold to medical classification and provides as additional motivation for combination ([Abstract] “We extend the mathematical framework even further by differentiating between costs associated with the two possible errors: predicting f(X) = 0 whilst Y = 1 and predicting f(X) = 1 whilst Y = 0. Such situations are common in, for instance, medical studies where misclassifying a sick patient as healthy is worse than the opposite.”). This motivation for combination also applies to the remaining claims which depend on this combination.
Regarding claim 11, Williams teaches The method according to claim 9.
However, Williams doesn't explicitly teach the aggregate negative probability threshold is selected for the numeric score having: 49% probability, 45% probability, 40% probability, 35% probability, 30% probability, 25% probability, 20% probability, 15% probability, 10% probability, 5% probability, and 0% probability.
Herbei, in the same field of endeavor, teaches The method according to claim 9, the aggregate negative probability threshold is selected for the numeric score having: 49% probability, 45% probability, 40% probability, 35% probability, 30% probability, 25% probability, 20% probability, 15% probability, 10% probability, 5% probability, and 0% probability ([p. 4] "we can restrict ourselves to the cases 0 ≤ d ≤ 1/2 and we denote the relevant risk function […] the Bayes rule (5) simplifies to […] 1 if η(x) > 1-d". Herbei explicitly restricts d to less than or equal to 50%, so 1-d must therefore be greater than or equal to 50%).
Williams as well as Herbei are directed towards machine learning classification. Therefore, Williams as well as Herbei are reasonably pertinent analogous art. It would have been obvious before the effective filing date of the claimed invention to substitute Williams' generic threshold strategy with the mathematically precise thresholding strategy in Herbei. Herbei explicitly extends the threshold to medical classification and provides as additional motivation for combination ([Abstract] “We extend the mathematical framework even further by differentiating between costs associated with the two possible errors: predicting f(X) = 0 whilst Y = 1 and predicting f(X) = 1 whilst Y = 0. Such situations are common in, for instance, medical studies where misclassifying a sick patient as healthy is worse than the opposite.”). This motivation for combination also applies to the remaining claims which depend on this combination.
Claim 8 is rejected under 35 U.S.C. § 103 as being unpatentable over the combination of Williams and Montague (“Relevance Score Normalization for Metasearch”, 2001).
Regarding claim 8, Williams teaches The method according to claim 1.
However, Williams doesn't explicitly teach further comprising prior to sorting, transforming the test evaluation results to a numeric score having a normalized distribution across a defined range.
Montague, in the same field of endeavor, teaches The method according to claim 1 further comprising prior to sorting, transforming the test evaluation results to a numeric score having a normalized distribution across a defined range ([p. 2] "Scores vs. Ranks: The final desired output of a search system (or for that matter a metasearch system) is usually a ranked list of documents, in more-relevant to less-relevant order. But usually relevance scores are computed for each document first, from which the rankings are then derived. If these “intermediate” relevance scores from each input system are available to the metasearch system, it may be advantageous, as they generally contain more information than the mere rankings: the ranked ordering can be computed from the relevance scores, but not vice-versa. In this paper, we assume that the metasearch algorithms are always given access to the underlying input systems’ relevance scores"; [p. 4] "Shift invariant: Let R be a set of relevance scores and R_c be R shifted by an additive constant c. That is, for scr(d) ∈ R, scr(d_c) = scr(d) + c ∈ R_c. Let scr′(d) denote the normalized score of document d. Then we say that a normalization scheme is shift invariant if scr′(d) = scr′(d_c); both the shifted and unshifted set of scores normalize to the same set. In other words, we would like our normalization scheme to be insensitive to mere shifts of the input." Montague explicitly performs the normalization transformation prior to sorting/ranking).
Williams as well as Montague are directed towards document retrieval, scoring, and ranking. Therefore, Williams as well as Montague are reasonably pertinent analogous art. It would have been obvious before the effective filing date of the claimed invention to combine the teachings of Williams with the teachings of Montague by normalizing document scores before ranking/sorting. Montague provides as additional motivation for combination ([p. 6] “By simply using more robust statistics than max and min in the normalization scheme we can achieve significant improvements”).
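For illustration only, normalizing scores to a defined range before sorting can be sketched as follows. This is a minimal sketch, not code from Montague or Williams: it assumes a simple min-max normalization onto [0, 1] as one example of a shift-invariant scheme, and the function names and sample scores are hypothetical.

```python
def min_max_normalize(scores):
    """Map raw scores onto [0, 1] by shifting the minimum to 0 and scaling the maximum to 1."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # All scores are equal; map everything to 0.0 (a choice made for this sketch).
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def rank_by_normalized_score(scores):
    """Normalize first, then return item indices sorted from highest to lowest score."""
    normalized = min_max_normalize(scores)
    return sorted(range(len(scores)), key=lambda i: normalized[i], reverse=True)


# Hypothetical raw scores and the same scores shifted by an additive constant c = 10.
raw = [2.0, 5.0, 3.0]
shifted = [s + 10.0 for s in raw]

normalized_raw = min_max_normalize(raw)
normalized_shifted = min_max_normalize(shifted)
ranking = rank_by_normalized_score(raw)
```

Because the minimum is subtracted before scaling, the shifted and unshifted scores normalize to the same values, which is the shift-invariance property described in the quoted passage, and the sort operates on the normalized scores rather than on the raw ones.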
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Laks (US20160350675A1) is directed towards training an AI classifier on positive and negative datasets, running a test set, sorting results by probability threshold, examining sorted results to find errors, and retraining.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SIDNEY VINCENT BOSTWICK whose telephone number is (571)272-4720. The examiner can normally be reached M-F 7:30am-5:00pm EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Miranda Huang, can be reached at (571)270-7092. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/SIDNEY VINCENT BOSTWICK/Examiner, Art Unit 2124