DETAILED ACTION
This action is written in response to the application filed 5/30/23. The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Subject Matter Eligibility
In determining whether the claims are subject matter eligible, the examiner has considered and applied the 2019 USPTO Patent Eligibility Guidelines, as well as guidance in the MPEP chapter 2106. The examiner finds that the independent claims are directed to the practical application of predicting transaction classes for incoming transaction data. Furthermore, the combination of steps performed in the recited method cannot be practically performed as a mental process.
Claim Rejections - 35 USC § 112(b) - Indefiniteness
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
Claim 18 is rejected under 35 U.S.C. 112(b), as being indefinite for failing to particularly point out and distinctly claim the subject matter which applicant regards as the invention.
Claim 18 recites: “The computer-implemented system of claim 11, wherein the classier module is configured to generate”. The phrase “the classier module” lacks antecedent basis. This should recite “the classifier module”.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The following are the references relied upon in the rejections below:
Agarwal (Agarwal, Sachin, et al. "A novel automated financial transaction system using natural language processing." International Conference on Advanced Machine Learning Technologies and Applications. Cham: Springer International Publishing, 2019.)
Bhavani (Jaya Bhavani, D., et al. "Hybrid Feature Selection Model for Credit Card Data Classification." Proceedings of Third International Conference on Intelligent Computing, Information and Control Systems: ICICCS 2021. Singapore: Springer Nature Singapore, 2022.)
Chandola (Chandola, Varun, Arindam Banerjee, and Vipin Kumar. "Anomaly detection: A survey." ACM computing surveys (CSUR) 41, no. 3 (2009): 1-58.)
Chew (Chew, Peter. "Unsupervised-learning financial reconciliation: a robust, accurate approach inspired by machine translation." Proceedings of the First ACM International Conference on AI in Finance. 2020.)
Prusti (Prusti, Debachudamani, and Santanu Kumar Rath. "Fraudulent transaction detection in credit card by applying ensemble machine learning techniques." In 2019 10th international conference on computing, communication and networking technologies (ICCCNT), pp. 1-6. IEEE, 2019.)
Ramani (US 20220351210 A1)
Claims 1-4, 6-8, 11-14 and 16-18 are rejected under 35 U.S.C. 103 as being unpatentable over Prusti, Agarwal, Chandola and Ramani.
Regarding claims 1 and 11, Prusti discloses a computerized method (and a computer-implemented system) for automatically classifying a selected transaction, the method comprising:
receiving, by a computing device, historical data comprising a plurality of historical transactions assigned to respective ones of a plurality of transaction classes;
P. 1, second col., “The objective of this study is to identify whether the transaction is fraudulent or not by using credit card transaction data and ultimately to find the detection rate at which the model is able to classify. The testing on the dataset has been performed on the input transaction data by choosing the classification algorithms.” (Emphasis added.)
preprocessing, by the computing device, the historical data including (i) correlating portions of the historical data to respective ones of the plurality of transaction classes …
P. 3, first col., sec. D. “Multilayer Perceptron Model”. The Examiner notes that this is a supervised algorithm which relies upon labeled training data.
generating, by the computing device, a classifier for predicting transaction classes for incoming transaction data, generating the classifier comprising:
training, by the computing device, a machine learning model for predicting transaction classes, the machine learning model being trained using the vectorized tokens and their corresponding transaction classes;
PP. 2-3, describing several different classification models, including “k-nearest neighbor”, “random forest method”, “multilayer perceptron” and “bagging classifier”.
combining, by the computing device, the heuristic layer with the machine learning model to generate the classifier; and
P. 4, fig. 3 (reproduced below), illustrating an ensemble model for combining the results from several individual machine learning models.
[Image: Prusti, fig. 3]
providing, by the computing device, data related to the selected transaction to the classifier to generate a prediction of a transaction class for assignment to the selected transaction …
Fig. 3 (reproduced above), “final predicted class”.
Agarwal discloses the following further limitations which Prusti does not disclose:
preprocessing, by the computing device, the historical data including … (ii) cleansing the historical data;
Fig. 1 (reproduced below): “Sentence Segmentation”, “Tokenization”, “POS tagging using NLTK”.
[Image: Agarwal, fig. 1]
tokenizing, by the computing device, the preprocessed historical data portion for each transaction class to generate a plurality of tokens for each transaction class; …
P. 539, “3. Tokenization Each natural language transaction is then divided into constituent parts (words or phrases), called tokens. The result of this step is an array of tokens (Fig. 3)”.
Fig. 3 (reproduced below).
[Image: Agarwal, fig. 3]
See also fig. 1 (reproduced above): ‘tokenization’.
generating, by the computing device, a heuristic layer for predicting transaction classes, the heuristic layer comprising a plurality of predefined prediction rules correlated to respective ones of a plurality of transaction classes; and
The Examiner interprets “heuristic layer” according to its broadest reasonable interpretation as encompassing the rule-based classification technique employed by Agarwal in the passage below. The Examiner notes that the term is not defined in the Applicant’s specification.
P. 537, “Each of the accounting transactions can be of the form of various invoices, receipts, letters of intent, electricity bill, telephone bill, etc. Iswandi, Suwardi and Maulidevi, had proposed a design [9] of rules to identify the entities located on the sales invoice. There are some entities identified in a sales invoice, namely: invoice date, company name, invoice number, product id, product name, quantity and total price. The rapid identification of all these entities is rigorously done using the named entity recognition method as elaborate in [15]. The entities generated from the rules used as a basis for automation process of data input into the accounting system.”
At the time of filing, it would have been obvious to a person of ordinary skill to combine the techniques of tokenization and rule-based classification (for named entity recognition) taught by Agarwal with the fraud classification system of Prusti because both techniques can help systems make effective use of natural language data that is often readily available for financial transactions. This information, once processed, can yield improved classification decisions.
Ramani discloses the following further limitations which Prusti/Agarwal do not disclose:
vectorizing, by the computing device, the plurality of tokens associated with each transaction class to generate a set of vectorized tokens for each transaction class that comprises a plurality of normalized weights assigned to the tokens based on a frequency of occurrences of the tokens within the corresponding transaction class;
[0085] “Step 1—Transaction Tagging: Some of the information used to analyze behavior takes the form of text. Initially, the raw text may be vectorized using a character level term frequency-inverse document frequency (TF-IDF) vectorizer that works at the unigram, bigram and trigram levels.”
At the time of filing, it would have been obvious to a person of ordinary skill to combine the text-processing technique disclosed by Ramani with the Prusti/Agarwal system because the former renders unstructured text data usable for subsequent processing by a machine learning model. All three disclosures pertain to financial transaction processing.
Chandola discloses the following further limitations which Prusti/Agarwal/Ramani do not disclose:
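For illustration only, the vectorizing limitation discussed above can be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation of Ramani or of the Applicant: it uses word-level tokens grouped per transaction class (Ramani describes a character-level vectorizer), a smoothed inverse document frequency, and L2 normalization. The function name `tfidf_by_class` and the sample class names are hypothetical.

```python
import math
from collections import Counter


def tfidf_by_class(tokens_by_class):
    """Return {class: {token: weight}} holding L2-normalized TF-IDF weights.

    Assumption: each transaction class is treated as one "document" whose
    tokens are the tokens of all historical transactions in that class.
    """
    n_classes = len(tokens_by_class)

    # Document frequency: the number of classes in which each token appears.
    df = Counter()
    for tokens in tokens_by_class.values():
        df.update(set(tokens))

    vectors = {}
    for cls, tokens in tokens_by_class.items():
        counts = Counter(tokens)
        # Term frequency times a smoothed IDF, so that a token present in
        # every class still keeps a nonzero weight.
        raw = {
            tok: cnt * (math.log((1 + n_classes) / (1 + df[tok])) + 1.0)
            for tok, cnt in counts.items()
        }
        # Normalize so that each class vector has unit Euclidean length.
        norm = math.sqrt(sum(w * w for w in raw.values())) or 1.0
        vectors[cls] = {tok: w / norm for tok, w in raw.items()}
    return vectors


# Hypothetical sample data: tokens already extracted per transaction class.
sample = {
    "DIVIDEND": ["dividend", "payment", "payment"],
    "FEE": ["fee", "payment"],
}
vectors = tfidf_by_class(sample)
```

In this sketch a token that occurs in only one class (such as "fee") receives a larger weight than an equally frequent token shared by all classes (such as "payment"), which is the frequency-based weighting the limitation recites.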
providing, by the computing device, data related to the selected transaction to the classifier to generate a prediction of a transaction class for assignment to the selected transaction along with a confidence score associated with the predicted transaction class.
PP. 10-11, “2.4. Output of Anomaly Detection An important aspect for any anomaly detection technique is the manner in which the anomalies are reported. Typically, the outputs produced by anomaly detection techniques are one of the following two types:
2.4.1. Scores. Scoring techniques assign an anomaly score to each instance in the test data depending on the degree to which that instance is considered an anomaly. Thus the output of such techniques is a ranked list of anomalies. An analyst may choose to either analyze the top few anomalies or use a cutoff threshold to select the anomalies.
2.4.2. Labels. Techniques in this category assign a label (normal or anomalous) to each test instance. Scoring-based anomaly detection techniques allow the analyst to use a domain-specific threshold to select the most relevant anomalies. Techniques that provide binary labels to the test instances do not directly allow the analysts to make such a choice, though this can be controlled indirectly through parameter choices within each technique.”
At the time of filing, it would have been obvious to a person of ordinary skill to combine the anomaly score technique disclosed by Chandola with the combined system of Prusti/Agarwal/Ramani because this is an interpretable heuristic that helps the user understand the likelihood of correct classification. All four documents pertain to financial transaction processing.
Regarding claims 2 and 12, Agarwal discloses the further limitations wherein the historical data is in an unstructured text form.
P. 539, secs. 2-4: “Sentence Segmentation”, “Tokenization”, and “POS Tagging”.
Regarding claims 3 and 13, Ramani discloses the further limitations wherein cleansing the historical data comprises removing non-contextual data from the historical data, including removing at least one of symbols, predefined characters or numbers from the historical data.
[0078] “First, the algorithm performs pre-processing steps to remove noise and other inadvertent artifacts from the transactions.” (Emphasis added.)
[0096] “Step 3—Labelling Clusters: In an exemplary embodiment, the clusters of transactions generated in step 2 may be analyzed by investigators for sanity. Based on their feedback, more tags may be added to the list, and some tags may be removed, so as to capture any remaining clusters of transactions or outliers.” (Emphasis added.)
The Examiner notes that ‘tags’ comprise letters, which are symbols.
Regarding claims 4 and 14, Ramani discloses the further limitation wherein tokenizing the preprocessed historical data comprises applying a unigram methodology that transforms each token into an independent feature.
[0085] “Step 1—Transaction Tagging: Some of the information used to analyze behavior takes the form of text. Initially, the raw text may be vectorized using a character level term frequency-inverse document frequency (TF-IDF) vectorizer that works at the unigram, bigram and trigram levels.”
Regarding claims 6 and 16, Ramani discloses the further limitations wherein vectorizing the plurality of tokens is performed using at least one of a term frequency-inverse document frequency (TF-IDF) vectorization approach or a count vectorization approach.
[0085] “Step 1—Transaction Tagging: Some of the information used to analyze behavior takes the form of text. Initially, the raw text may be vectorized using a character level term frequency-inverse document frequency (TF-IDF) vectorizer that works at the unigram, bigram and trigram levels.”
Regarding claims 7 and 17, Chandola discloses the further limitation comprising periodically re-training the machine learning model with new historical data.
P. 44, “[T]echniques have been proposed that can operate in an online fashion [Pokrajac et al. 2007]; such techniques not only assign an anomaly score to a test instance as it arrives, but also incrementally update the model.”
Regarding claims 8 and 18, Chandola discloses the further limitations wherein providing data related to the selected transaction to the classifier to generate the predicted transaction class for the selected transaction comprises:
first providing the data to the heuristic layer to determine if a predefined prediction rule is satisfied;
PP. 21-22, sec. 4.4, “Rule-Based”: “Rule-based anomaly detection techniques learn rules that capture the normal behavior of a system. A test instance that is not covered by any such rule is considered as an anomaly. Rule-based techniques have been applied in multi-class as well as one-class settings.”
(cont.) “A basic multi-class rule-based technique consists of two steps. The first step is to learn rules from the training data using a rule learning algorithm, such as RIPPER, Decision Trees, and so on. Each rule has an associated confidence value that is proportional to ratio between the number of training instances correctly classified by the rule and the total number of training instances covered by the rule. The second step is to find, for each test instance, the rule that best captures the test instance. The inverse of the confidence associated with the best rule is the anomaly score of the test instance. Several minor variants of the basic rule-based technique have been proposed [Fan et al. 2001; Helmer et al. 1998; Lee et al. 1997; Salvador and Chan 2003; Teng et al. 1990].” (Emphasis added.)
if a predefined prediction rule is satisfied, selecting the corresponding transaction class as the predicted transaction class for the selected transaction; and
Id. “Rule-based anomaly detection techniques learn rules that capture the normal behavior of a system.”
if no prediction rule in the heuristic layer is satisfied, providing the data to the machine learning model to determine the predicted transaction class for the selected transaction.
Id. “A test instance that is not covered by any such rule is considered as an anomaly.”
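For illustration only, the ordering recited in claims 8 and 18 (rules first, machine learning model only when no rule is satisfied) can be sketched as follows. This is a minimal sketch under stated assumptions and does not reproduce any cited reference: the keyword rules in `RULES`, the function `classify`, and the stand-in `toy_model` are all hypothetical.

```python
# Hypothetical predefined prediction rules: (keyword, transaction class).
RULES = [
    ("dividend", "DIVIDEND"),
    ("wire fee", "FEE"),
]


def toy_model(text):
    """Stand-in for a trained classifier; always predicts one class."""
    return "OTHER"


def classify(description, rules, model):
    """Return (predicted_class, source), where source is "rule" or "model"."""
    text = description.lower()
    # Step 1: consult the heuristic layer; the first satisfied rule wins.
    for keyword, transaction_class in rules:
        if keyword in text:
            return transaction_class, "rule"
    # Step 2: no rule was satisfied, so defer to the machine learning model.
    return model(text), "model"


matched = classify("Quarterly DIVIDEND payout", RULES, toy_model)
fallback = classify("Unrecognized memo text", RULES, toy_model)
```

The sketch returns the source of each prediction alongside the class so that rule-based and model-based outcomes can be told apart; that second return value is a design choice of the sketch, not something recited in the claims.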
Claims 5 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Prusti, Agarwal, Chandola, Ramani and Bhavani.
Regarding claims 5 and 15, Bhavani discloses the following further limitation, which Prusti/Agarwal/Chandola/Ramani do not disclose: applying a SelectKBest algorithm to optimally reduce a number of the plurality of tokens in each transaction class.
P. 178, “v1 to v29 contains the transactions that are the principal components obtained with PCA and SelectK-Best Algorithm.”
P. 187, sec. 7 “We have taken the six different Machine Learning Classification Algorithms, namely SVM, Random Forest, XGBoost, KNN, Decision Tree, and Logistic Regression and also, we have added another feature selection along with PCA algorithm that is SelectK-Best algorithm to have better accuracy.”
At the time of filing, it would have been obvious to a person of ordinary skill to apply the SelectKBest feature selection technique (as taught by Bhavani) in combination with the Prusti/Agarwal/Chandola/Ramani system because feature selection reduces computing time and yields simpler, more interpretable models without unduly affecting predictive performance.
Claims 9-10 and 19-20 are rejected under 35 U.S.C. 103 as being unpatentable over Prusti, Agarwal, Chandola, Ramani and Chew.
Regarding claims 9 and 19, Chew discloses the further limitation, which Prusti/Agarwal/Chandola/Ramani do not disclose, wherein the selected transaction represents an exception from a manual classification procedure.
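For illustration only, selecting the k best tokens for a transaction class can be sketched as follows. This is a minimal sketch under stated assumptions, not Bhavani's implementation and not the scikit-learn `SelectKBest` class: it scores each token with a Pearson chi-squared statistic on a 2x2 table (token present or absent versus target class or other class) and keeps the k highest-scoring tokens. The function names and sample data are hypothetical.

```python
def chi2_score(docs, labels, token, target):
    """Pearson chi-squared statistic for token presence vs. class membership."""
    a = b = c = d = 0
    for tokens, label in zip(docs, labels):
        present = token in tokens
        in_class = label == target
        if present and in_class:
            a += 1  # token present, target class
        elif present:
            b += 1  # token present, other class
        elif in_class:
            c += 1  # token absent, target class
        else:
            d += 1  # token absent, other class
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    # A degenerate table (an empty row or column) carries no signal.
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom


def select_k_best(docs, labels, target, k):
    """Return the k tokens with the highest score for the target class."""
    vocab = sorted({tok for tokens in docs for tok in tokens})
    ranked = sorted(
        vocab,
        key=lambda tok: chi2_score(docs, labels, tok, target),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical sample: one token set per historical transaction.
docs = [{"dividend", "payment"}, {"dividend", "cash"},
        {"fee", "payment"}, {"fee", "cash"}]
labels = ["DIVIDEND", "DIVIDEND", "FEE", "FEE"]
kept = select_k_best(docs, labels, "DIVIDEND", 2)
```

In this sample, "payment" and "cash" occur equally often in both classes and therefore score zero, so only the two class-discriminating tokens are kept.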
The Examiner interprets ‘exception’ according to its broadest reasonable interpretation in view of the Applicant’s specification at p. 1 (“However, for transactions that are not matched by the current approach, such transactions (hereinafter referred to as "exceptions") are manually researched by analysts with domain expertise and manually reclassified.”).
P. 7, fig. 4, “Manual reconciliation”.
At the time of filing, it would have been obvious to a person of ordinary skill to combine the manual reconciliation technique disclosed by Chew with the Prusti/Agarwal/Chandola/Ramani system because data labels, whether human-made or machine-made, are not infallible. By identifying and correcting errors in training data, the model can subsequently be retrained using this improved data, yielding more accurate prediction results.
Regarding claims 10 and 20, Chew discloses the further limitations wherein the historical data includes reclassifications of the historical exceptions.
The Examiner interprets ‘exception’ according to its broadest reasonable interpretation in view of the Applicant’s specification at p. 1 (“However, for transactions that are not matched by the current approach, such transactions (hereinafter referred to as "exceptions") are manually researched by analysts with domain expertise and manually reclassified.”).
P. 7, fig. 4, “Manual reconciliation”.
Additional Relevant Prior Art
The following references were identified by the Examiner as being relevant to the disclosed invention, but are not relied upon in any particular prior art rejection:
Pippenger (US 2019/0378043 A1) discloses a system for enterprise-scale transaction analysis featuring, inter alia, vectorization of unstructured text and TF-IDF analysis (see, e.g., [0050]).
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Vincent Gonzales whose telephone number is (571) 270-3837. The examiner can normally be reached on Monday-Friday 7 a.m. to 4 p.m. MT. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Miranda Huang, can be reached at (571) 270-7092.
Information regarding the status of an application may be obtained from the USPTO Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov.
/Vincent Gonzales/Primary Examiner, Art Unit 2124