Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Information Disclosure Statement
The information disclosure statements (IDS) submitted on 08/20/2024 and 06/05/2025 are in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statements are being considered by the examiner.
Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.
The following is a quotation of pre-AIA 35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.
The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art. The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is invoked.
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph:
(A) the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function;
(B) the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and
(C) the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function.
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function.
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function.
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier. Such claim limitation(s) is/are: “identifying, by a server computing device, a review image depicting a physical identification document to be validated, the physical identification document comprising one or more areas of interest each associated with a fraud signal”, “retrieving, by the server computing device for the review image, a dataset comprising a plurality of reference images, each reference image depicting a reference physical identification document, including selecting one or more reference images for inclusion in the dataset based upon a fraud detection prediction for the reference images as determined by a machine learning classification model”, “aligning, by the server computing device, one or more document features depicted in the review image and the plurality of reference images and crop each aligned image according to one of the areas of interest, including determining a reference pose based upon the review image and transforming the plurality of reference images according to the reference pose”, “identifying, by the server computing device, a fraud detection question for the cropped review image based upon the fraud signal associated with the area of interest in the cropped review image”, “displaying, by the server computing device, the cropped review image, the cropped reference images, and the fraud detection question in a user interface on an endpoint computing device”, “receiving, by the server computing device, a response to the fraud detection question from the endpoint computing device”, and “labeling, by the 
server computing device, the review image as genuine or fraudulent based upon the response when the accuracy of the fraud detection question is above a predetermined threshold”, in claim 9, and “identifying, by a server computing device, a review image depicting a physical identification document to be validated, the physical identification document comprising one or more areas of interest each associated with a fraud signal”, “generating, by the server computing device for the review image, a dataset comprising a plurality of reference images, each reference image depicting a reference physical identification document, including selecting one or more reference images for inclusion in the dataset based upon a fraud detection prediction for the reference images as determined by a machine learning classification model”, “aligning, by the server computing device, one or more document features depicted in the review image and the plurality of reference images and crop each aligned image according to one of the areas of interest, including determining a reference pose based upon the review image and transforming the plurality of reference images according to the reference pose”, “associating, by the server computing device, a fraud detection question for the cropped review image based upon the fraud signal associated with the area of interest in the cropped review image”, “displaying, by the server computing device, the cropped review image, the cropped reference images, and the fraud detection question in a user interface to test users at a plurality of endpoint computing devices”, “receiving, by the server computing device, responses to the fraud detection question from each of the endpoint computing devices”, “determining, by the server computing device, an accuracy of the fraud detection question based upon the responses”, and “labeling, by the server computing device, the review image as genuine or fraudulent based upon the responses when the accuracy of the fraud 
detection question is above a predetermined threshold”, in claim 19.
Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, applicant may: (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph.
The examiner suggests amending claims 9 and 19 to recite the structure of the “server computing device” provided in claims 1 and 17.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claim(s) 1-4, 6-12, and 14-20 are rejected under 35 U.S.C. 103 as being unpatentable over Addison et al. (US 20200184212 A1) in view of Mathew et al. ("DocVQA: A Dataset for VQA on Document Images", 2021; Copy provided by examiner).
The examiner would like to point out that the various “server computing device” limitations referenced in claims 9 and 19 are being interpreted under 35 U.S.C. 112(f) as described in ¶ [015] of the disclosure as “a memory for storing computer-executable instructions and a processor that executes the computer executable instructions” and as depicted in Fig. 1.
Fig. 1 is a diagram of the system including a “server computing device 106 that includes processor 107a, memory 107b, disk storage 107c, application modules 108 (including user interface (UI) module 108a, image preparation module 108b, and question generation module 108c), and machine learning (ML) classification model 109” (¶ [033]), coupled to a communications network.
Regarding Claim 1:
Addison et al. teaches: A system for deploying operational fraud detection training data for physical identification documents (Abstract “A system and method to detect fraudulent documents”; ¶ [0005] “can be used to train a fraud classifier for deployment in document validation workflows”; ¶ [0009] “system can be trained for fraud detection of different document types, and therefore, can be used across different domains”), the system comprising a server computing device with a memory for storing computer-executable instructions and a processor that executes the computer-executable instructions to (¶ [0030]):
identify a review image depicting a physical identification document to be validated (Addison et al. refers to “review images” as “synthetic documents” for extraction of feature data and validation (¶¶ [0034]-[0036]); ¶ [0040] “Documents could also refer to images of documents”), the physical identification document comprising one or more areas of interest each associated with a fraud signal (¶ [0037] “take in training documents as input…identify likely target regions where documents may be tampered with. These regions may include regions associated with a logo, a signature, a watermark as well as possibly other regions in a document.”);
retrieve, for the review image, a dataset comprising a plurality of reference images, each reference image depicting a reference physical identification document, including automatically selecting one or more reference images for inclusion in the dataset based upon a fraud detection prediction for the reference images as determined by a machine learning classification model (Addison et al. teaches a validation process that includes selecting reference images from a dataset provided by a fraud anticipation module for review based on both algorithmic and manual processes (¶ [0041]); Automated validation is performed using any kind of suitable AI or machine learning model (¶ [0055]); ¶ [0059] “The various machine learning algorithms described above may be implemented using known machine learning software packages and/or frameworks.”);
align one or more document features depicted in the review image and the plurality of reference images and crop each aligned image according to one of the areas of interest, including determining a reference pose based upon the review image and transforming the plurality of reference images according to the reference pose (Addison et al. teaches aligning and cropping images through feature extraction which includes generating bounding boxes around features or areas of interest in the document to identify target regions for extraction (¶¶ [0037]-[0038]; ¶ [0042]));
identify a fraud detection (Addison et al. teaches a fraud detection and validation process, including by both algorithmic and manual processes, based upon a fraud signal found in the cropped image (¶¶ [0037]-[0038]; ¶ [0041]));
display the cropped review image, the cropped reference images, and the fraud detection (Synthetic data, i.e. review images, and sample documents are updated in databases and available for review (¶ [0041]), target regions generated are output for training (¶¶ [0044]-[0045]), a computer program generates output (¶ [0063]), and output can be displayed (¶ [0065]); ¶ [0065] “To provide for interaction with a user, implementations may be implemented on a computer having a display device”);
receive a response to the fraud detection (Addison et al. teaches the user can provide input and interact with the computing device using a mouse, keyboard, and other sensory feedback tools (¶ [0065])); and label the review image as genuine or fraudulent based upon the response (¶ [0041] “sample of documents from databases 208 may be reviewed to determine which are actually fraudulent. Then databases 208 may be updated to include information about confirmed fraudulent documents”).
Addison et al. does not explicitly teach identifying a question based on the identified region, displaying the question, and receiving a response to the question from the endpoint computing device.
In a related art, Mathew et al. teaches: a model for extracting and interpreting textual, visual cues including layout, non-textual elements, and style of a document (Sec. 1, P. 2199, ¶ [0003]), identifying questions based on the identified regions (Refer to P. 2199, Figure 1, an example of interpreting layout/structure and answering questions from a dataset). Workers create questions for the Visual Question Answering (VQA) system dataset and flag inapt questions during a second stage (Sec. 3.1, P. 2201, ¶¶ [0003] – [0007]). A response to the question is output using the VQA model (Sec. 4.3, P. 2204).
Addison et al. and Mathew et al. are both considered to be analogous to the claimed invention because they both extract data from regions of interest found in a document. All of the claimed elements were known in the prior art at the time that the invention was made. Therefore, it would have been obvious to a person of ordinary skill in the art prior to the effective filing date of the claimed invention to have modified the teachings of Addison et al. to incorporate the teachings of Mathew et al. and identify a question relating to fraud detection, i.e. a fraud detection question, based on the results of the fraud detection model and the corresponding area of interest found in the review image of the document. It would also have been obvious to a person of ordinary skill in the art prior to the effective filing date of the claimed invention to have modified the teachings of Addison et al. to incorporate the teachings of Mathew et al. to display a fraud detection question, as Addison et al. teaches displaying output, and receive a response to the fraud detection question from the endpoint computing device.
The suggestion/motivation for combining the teachings of Addison et al. and Mathew et al. would have been to provide an improved method of analyzing fraud training data found on a document by implementing a “simultaneous use of visual and textual cues for answering questions asked on document images.” (Mathew et al., P. 2206, Sec. 6). Doing so would make the system more robust and provide the user with more displayed data.
Regarding Claim 9:
Claim 9 recites claim limitations that substantially resemble the limitations rejected in Claim 1. Claim 9 is rejected for the same reasons set forth above for Claim 1 and for the additional reasons discussed herein.
Addison et al. further teaches: a computerized method of deploying operational fraud detection training data for physical identification documents (Abstract “A system and method to detect fraudulent documents”; ¶ [0005] “can be used to train a fraud classifier for deployment in document validation workflows”; ¶ [0009] “system can be trained for fraud detection of different document types, and therefore, can be used across different domains”), and a predetermined threshold is used to determine whether a review image is labeled as genuine or fraudulent (Each region or the entire document could be classified as potentially fraudulent or not based on “a percentage corresponding to the probability that the region has been tampered with (meaning a fraudulent pattern has been detected in the region). Regions with percentages higher than a threshold percentage could be flagged for further review” (¶ [0057]), by a user or automatically processed without requiring user interaction (¶¶ [0057]-[0058])).
Regarding Claims 2 and 10:
Addison et al. and Mathew et al. teach the limitations of claims 1 and 9.
Addison et al. further teaches: wherein each area of interest corresponds to a visual feature of the physical identification document (refer to Abstract, “extracting document features from sample data corresponding to target regions of the documents, such as logo regions and watermark regions.”, and ¶ [0048]).
Regarding Claims 3 and 11:
Addison et al. and Mathew et al. teach the limitations of claims 1 and 9.
Addison et al. further teaches: wherein each reference image depicts a reference physical identification document previously labeled as genuine or fraudulent (¶ [0026] “the fraud detection system may receive sample documents, which may serve as training documents. In some cases, the sample documents may include both validated documents and fraudulent documents (that is, documents where one or more regions have been tampered or modified) … Validated documents may include documents that have been previously certified or validated as authentic.”).
Regarding Claims 4 and 12:
Addison et al. and Mathew et al. teach the limitations of claims 1 and 9.
Addison et al. further teaches: executing the machine learning classification model using the reference images as input to generate the fraud detection prediction for each reference image (¶ [0026] “the fraud detection system may receive sample documents, which may serve as training documents.”; ¶ [0027] “Next, in step 104, Al models associated with the fraud detection system may be used to generate synthetic training data.”; ¶ [0028] “The fraud classification model may be any Al or machine learning model that takes one or more documents as input and outputs a classification for the one or more documents. The classification may identify the document(s) as potentially fraudulent or not.”).
Regarding Claims 6 and 14:
Addison et al. and Mathew et al. teach the limitations of claims 1 and 9.
Addison et al. further teaches: one or more criteria for evaluating whether the review image is genuine or fraudulent (¶ [0009] “Furthermore, the embodiments employ a modular approach that easily scalable across document features and types. Models for feature extraction and for generating new fraud patterns can both be easily adapted to accommodate additional features (such as new regions in a document). Additionally, by providing different kinds of initial sample data (documents), the system can be trained for fraud detection of different document types, and therefore, can be used across different domains.”).
Regarding Claims 7 and 15:
Addison et al. and Mathew et al. teach the limitations of claims 6 and 14.
Addison et al. further teaches: a first indicator that the review image depicts a genuine physical identification document or a second indicator that the review image depicts a fraudulent physical identification document (Addison et al. teaches after the synthetic data is reviewed (i.e. review image), databases are updated to include information (i.e. indicator) on whether the document is fraudulent or not (¶ [0041])).
Regarding Claims 8 and 16:
Addison et al. and Mathew et al. teach the limitations of claims 1 and 9.
Addison et al. further teaches: wherein the labeled review image is stored for use as a reference image (Addison et al. teaches after the synthetic data is reviewed (i.e. review image) and labeled, it is stored in a database (¶ [0041]) and it can become a sample image for tampering (¶¶ [0010]-[0012])).
Regarding Claims 17 and 19:
Claims 17 and 19 recite claim limitations that substantially resemble the limitations rejected in Claims 1 and 9. Claims 17 and 19 are rejected for the same reasons set forth above for Claims 1 and 9, and for the additional reasons discussed herein.
Addison et al. further teaches the following limitations from claims 17 and 19, not found in claims 1 and 9: a system (and a computerized method) for generating operational fraud detection training data for physical identification documents (Abstract “A system and method to detect fraudulent documents is disclosed. The system uses a generative adversarial network to generate synthetic document data including new fraud patterns.”);
to test users at a plurality of endpoint computing devices (Addison et al. teaches a plurality of user tests, ¶ [0025] “A fraud detection system may be used in a variety of different contexts by a variety of different users… It may be appreciated that in some cases a fraud detection system could be operated by the end-user (such as a government office or business) while in other cases it could be operated by an information technology services provider that runs and maintains the fraud detection system on behalf of the end-user.” Examiner notes, under the broadest interpretation to a person of ordinary skill in the art, testing users at a plurality of endpoint computing devices, as claimed in Claims 17 and 19, may be interpreted as “a variety of different users” using a fraud detection system (Addison et al. ¶ [0025]); Refer to ¶¶ [0065]-[0067] for user-related endpoint computing devices);
determine an accuracy of the fraud detection (¶ [0057] “In the exemplary embodiment, fraud classifier 224 may return a percentage corresponding to the probability that the region has been tampered with (meaning a fraudulent pattern has been detected in the region).”);
label the review image as genuine or fraudulent based upon the responses when the accuracy of the fraud detection (¶ [0057] “Regions with percentages higher than a threshold percentage could be flagged for further review…. In other embodiments, each region could simply be classified as potentially fraudulent or not.”).
Addison et al. does not teach to determine accuracy based upon responses to questions, and to use a threshold for determining question accuracy.
In a related art, Mathew et al. teaches: to determine accuracy based upon responses to questions by using an “Accuracy metric” to measure the percentage of questions for which the predicted answer matches exactly with target answers for the question (refer to Mathew et al., P. 2204, Sec. 5.1, ¶ [0001]). Examiner notes a criterion of an exact match of the predicted answer to the target answer, as taught by Mathew et al., is considered a threshold to one of ordinary skill in the art. Examiner also notes Addison et al. and Mathew et al. teach a “fraud detection question”, as rejected in claims 1 and 9 (refer to the rejections of claims 1 and 9 found hereinabove).
Therefore, it would have been obvious to a person of ordinary skill in the art prior to the effective filing date of the claimed invention to have modified the teachings of Addison et al. to incorporate the teachings of Mathew et al. to determine the accuracy of a fraud detection question based on the responses of answers (taught by Mathew et al.) and to label the review image as genuine or fraudulent (taught by Addison et al.) based on a threshold relating to questions (taught by Mathew et al.). Doing so would provide the system an alternative method of increasing the accuracy of the fraud detection model, thus making the system more robust and reliable for the user.
Regarding Claims 18 and 20:
Addison et al. and Mathew et al. teach the limitations of claims 17 and 19.
Mathew et al. further teaches comparing the responses to a corpus of pre-labeled response data (Mathew et al. teaches datasets used for comparing data, “In the following analysis we compare statistics of questions, answers and OCR tokens with other similar datasets” (P. 2202, Sec. 3.2, ¶ [0003]). Mathew et al. further teaches Average Normalized Levenshtein Similarity (ANLS) and Accuracy (Acc.) models for comparing “predicted answers”, i.e. responses, to the “target answers”, i.e. pre-labeled response data, for the questions and determining an Accuracy metric percentage. (P. 2204, Sec. 5.1, ¶ [0001])); and generating an accuracy score for the fraud detection question based upon the comparison (“Fraud detection question” equally resembles the limitation rejected in claim 1; Mathew et al. teaches generating an accuracy score for a question based upon the comparison, P. 2204, Sec. 5.1, ¶ [0001] “Accuracy metric awards a zero score even when the prediction is only a little different from the target answer.”).
Claim(s) 5 and 13 are rejected under 35 U.S.C. 103 as being unpatentable over Addison et al. (US 20200184212 A1) in view of Mathew et al. ("DocVQA: A Dataset for VQA on Document Images", 2021; Copy provided by examiner) and in further view of Mobley et al. (US 11651093 B1).
Regarding Claims 5 and 13:
Addison et al. and Mathew et al. teach the limitations of claims 1 and 9.
Addison et al. and Mathew et al. teach: “wherein the user interface displays the cropped review image, the cropped reference images, and the fraud detection question”, as rejected in claims 1 and 9.
Addison et al. and Mathew et al. do not explicitly teach displaying elements in a single contiguous view.
In a related art, Mobley et al. teaches: “a user interface for displaying anomalies detected by the mask-overlap detector” (¶ [0052]). Mobley et al. further teaches portions of target boxes, annotations, and additional information may be displayed adjacent to each other on the user interface (¶¶ [0077]-[0080]). Mobley et al. is considered to be analogous to the claimed invention because it is in the same field of document fraud detection.
Therefore, it would have been obvious to a person of ordinary skill in the art prior to the effective filing date of the claimed invention to have modified the teachings of Addison et al. and Mathew et al. to incorporate the teachings of Mobley et al. and display multiple elements from the fraud detection system (e.g. the cropped review image, the cropped reference images, and the fraud detection questions), as taught by Addison et al. and Mathew et al., adjacent to one another, as taught by Mobley et al. Doing so would allow the user to view all of the results of the fraud detection data in one view, thus increasing the speed and accuracy of one’s ability to interpret the system’s results.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SAMUEL DAVID BAYNES whose telephone number is (571)272-0607. The examiner can normally be reached Monday - Friday 8:00 am - 5:00 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Stephen R Koziol can be reached at (408) 918-7630. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/S.D.B./
Samuel D. Baynes
Examiner, Art Unit 2665
/Stephen R Koziol/Supervisory Patent Examiner, Art Unit 2665