DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Amendment
In the amendment filed on January 2, 2026, the following has occurred: claim(s) 1, 9, and 17 have been amended. Claim(s) 1-5 and 7-18 are now pending.
Notice to Applicant
The Examiner has not rejected current claims 1-5 and 7-18 under 35 U.S.C. 101, as the claims as currently recited recite statutory subject matter. The claims are directed to a process that implements the integration of training a library of entity recognition models, wherein the library of entity recognition models comprises a plurality of machine learning models each trained using a common training dataset and a distinct set of training parameters, wherein the training dataset comprises a plurality of medical reports in various medical report categories, and associated lists of ground truth entity annotation vectors corresponding to the plurality of medical reports, wherein each set of training parameters is determined based on one or more attributes of one or more medical report categories and wherein each set of training parameters comprises a distinct set of loss adjustment factors for a list of target entity classes to account for class imbalances, combined with detecting attributes in the medical report using the library of entity recognition models, and identifying a plurality of named entities in the medical report using the entity recognition model of the library of trained entity recognition models by tokenizing the medical report to produce a plurality of tokens, and encoding the plurality of tokens into an embedding space using a deep neural network to produce a plurality of embedding vectors corresponding to the plurality of tokens.
The steps of training a library of entity recognition models, and identifying a plurality of named entities in the medical report using the entity recognition model of the library of trained entity recognition models by tokenizing the medical report to produce a plurality of tokens, encoding the plurality of tokens into an embedding space using a deep neural network, to produce a plurality of embedding vectors corresponding to the plurality of tokens could not be practically performed in the mind of a user or be considered a mathematical concept or a certain method of organizing human activity as currently recited. Applicant’s Specification in Paragraph [0107] recites “The technical effect of adjusting losses for a plurality of named entity classes based on a medical report category and a frequency of the named entity annotations in the training dataset is that an entity recognition model may be produced which has a reduced probability of mis-labeling named entities carrying relevant information for the report category, even when said named entities belong to an under-represented category of named entities within the training dataset. Further, by scaling losses for a plurality of named entity classes, as opposed to dropping or pooling named entity classes to force a model to prioritize one or more target classes, a probability of overfitting may be reduced, and an efficiency of knowledge extraction from the training dataset may be improved.”, and improving the efficiency of knowledge extraction is reflected in the claimed invention to achieve a technical improvement. The recited claims 1-5, 7-18 recite statutory subject matter.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1-5, 7-8, and 17-18 are rejected under 35 U.S.C. 103 as being unpatentable over Lucas et al. (U.S. Patent Pre-Grant Publication No. 2020/0176098) in view of Nguyen et al. (U.S. Patent No. 10,496,884).
As per independent claim 1, Lucas discloses a method comprising:
receiving a medical report for a patient (See Paragraphs [0027], [0042]: The system may detect when a patient record has been received, either partially or in full, and begin processing the patient record in aggregate or as a whole to determine relevant medical concepts for entry into the EMR, which the Examiner is interpreting the system may detect when a patient record has been received to encompass receiving a medical report for a patient);
classifying the medical report into a category of a plurality of pre-determined categories (See Paragraphs [0039]-[0042], [0206]: A combination of NLP and supervised, semi-supervised, or unsupervised MLA techniques may be used to generate an intelligent training set of data to recognize entries from the enumerated list of clinical drugs, in order to identify patterns within the text of abstracted documents which typically surround drug entries, which the Examiner is interpreting the pre-defined entries ([0206]) to encompass a category of a plurality of pre-determined categories);
matching the medical report with an entity recognition model from the library of entity recognition models based on the category (See Paragraphs [0101]-[0102], [0118]-[0119], [0157]-[0158]: Language models may vary based upon the type of document being processed, (e.g., pathology reports, progress notes, and other EHR and EMR documents, etc.), to optimize the type of information which may be extracted from the documents, which the Examiner is interpreting the language models to encompass entity recognition models, and identifying the type of document to encompass matching the medical report with an entity recognition model);
identifying a plurality of named entities in the medical report using the entity recognition model of the library of trained entity recognition models (See Paragraphs [0103], [0157]-[0162]: A probability distribution may be generated by applying a neural network for Named Entity Recognition (NER), a machine learning algorithm or a neural network may process the training data to generate a rule set or a trained neural network, which the Examiner is interpreting the identification of words and respective weights to encompass a plurality of named entities when combined with Nguyen described below) by tokenizing the medical report to produce a plurality of tokens, encoding the plurality of tokens into an embedding space using a deep neural network, to produce a plurality of embedding vectors corresponding to the plurality of tokens (See Paragraphs [0119]-[0121]: For sentences which are noisy (e.g., structured, but with unclear boundaries), a maximum entropy approach may be utilized, and in texts which are very specialized in nature (e.g., medical texts, legal texts, etc.), a tokenization and document segmentation algorithm may be applied, which the Examiner is interpreting a tokenization and document segmentation algorithm may be applied to encompass tokenizing the medical report to produce a plurality of tokens, encoding the plurality of tokens into an embedding space using a deep neural network, to produce a plurality of embedding vectors corresponding to the plurality of tokens (Paragraphs [0156], [0163]));
refining the plurality of named entities to produce a summary of the medical report (See Paragraphs [0238]-[0239]: The post-processing pipeline stage may receive a listing of all the structured entities and generate a response/report, and under a Diagnosis header/identifier, structured entities relating to diagnosis may be summarized with the final normalized entity, information from the entity structuring, and any confidence values generated during the classification and/or ranking/filtering, which the Examiner is interpreting the structured entities relating to diagnosis may be summarized with the final normalized entity to encompass refining the plurality of named entities to produce a summary of the medical report); and
displaying the summary of the medical report via a display device (See Paragraphs [0238]-[0239], [0252]: The Workbench may represent a server for maintaining a user interface (UI) to implement a patient record analysis system responsible for managing the flow of information between systems of the instant architecture and/or stage of the processing pipeline.)
While Lucas discloses the method as described above, Lucas may not explicitly teach training a library of entity recognition models, wherein the library of entity recognition models comprises a plurality of machine learning models each trained using a common training dataset and a distinct set of training parameters,
wherein the training dataset comprises a plurality of medical reports in various medical report categories, and associated lists of ground truth entity annotation vectors corresponding to the plurality of medical reports,
wherein each set of training parameters is determined based on one or more attributes of one or more medical report categories and wherein each set of training parameters comprises a distinct set of loss adjustment factors for a list of target entity classes to account for class imbalances, wherein the distinct set of loss adjustment factors are applied during training to the common training dataset without reducing a quantity of training samples from any entity class, and, wherein a first set of loss adjustment factors for a list of target entity classes of underrepresented entities are each set to a value greater than a threshold value, and a second set of loss adjustment factors for a list of target entity classes of overrepresented entities are each set to a value equal to or less than the threshold value;
detecting attributes in a medical report using the library of entity recognition models.
Nguyen teaches a method comprising: training a library of entity recognition models (See col. 22, ll. 63-67, col. 23, ll. 1-8, 44-67, col. 24, ll. 1-22: Studies are used to train a network, and the methods of the present disclosure can apply the tensor reshaping available in network libraries for a purpose different from that for which it is typically applied, which the Examiner is interpreting training a network to encompass training a library of entity recognition models when combined with the language models of Lucas), wherein the library of entity recognition models comprises a plurality of machine learning models each trained using a common training dataset and a distinct set of training parameters (See col. 13, ll. 32-67, col. 14, ll. 1-9: A computer algorithm that implements one or more machine learning algorithms can be used to process the training set, the neural network is designed to classify images and classes, and the system can apply the parameters learned during training to produce an estimate of the class label for the new image, which the Examiner is interpreting a computer algorithm that implements one or more machine learning algorithms to encompass the library of entity recognition models comprising a plurality of machine learning models, and the parameters learned during training to encompass a distinct set of training parameters),
wherein the training dataset comprises a plurality of medical reports in various medical report categories, and associated lists of ground truth entity annotation vectors corresponding to the plurality of medical reports (See col. 4, ll. 3-12, col. 22, ll. 52-67, col. 23, ll. 1-8, col. 25, ll. 40-59: Study-level of annotation is easier to obtain and this annotation is consistent with the normal work-flow of a typical radiologist, which in turn allows training networks with less human involvement and less radiologist-hour-cost, which the Examiner is interpreting study-level annotation to encompass ground truth entity annotation vectors (col. 25, ll. 40-59), and when combined with Lucas’ disclosure of medical reports in Paragraphs [0027], [0039]-[0042]),
wherein each set of training parameters is determined based on one or more attributes of one or more medical report categories (See col. 12, ll. 21-42: Reports, images, and labels associated with the images can be stored in the database and can be categorized, which the Examiner is interpreting the normal, clear, or abnormal labels to encompass one or more attributes of one or more medical report categories) and wherein each set of training parameters comprises a distinct set of loss adjustment factors for a list of target entity classes to account for class imbalances (See col. 16, ll. 53-67, col. 17, ll. 1-20, col. 25, ll. 26-59: col. 16, ll. 53-67 and col. 17, ll. 1-20 discuss class balance, which the Examiner is interpreting to encompass a list of target entity classes to account for class imbalances, as the flattening block can flatten the input into a vector, and the study score and associated loss can then be calculated based on a study label; and in col. 15, ll. 58-67, col. 16, ll. 1-12, the hinge loss functions of Nguyen are used to adjust the network parameters to optimize performance and accuracy of the network, which the Examiner is interpreting the calculation of hinge loss to encompass a distinct set of loss adjustment factors), wherein the distinct set of loss adjustment factors are applied during training to the common training dataset without reducing a quantity of training samples from any entity class (See col. 15, ll. 45-67, col. 16, ll. 1-12: Hinge loss calculations can be used during training and testing of the network, which the Examiner is interpreting to encompass the claimed portion), and wherein a first set of loss adjustment factors for a list of target entity classes of underrepresented entities are each set to a value greater than a threshold value (See col. 16, ll. 53-67, col. 17, ll. 1-57: Two classes can be identified for a study, and over-representation of data can be identified for one class compared to the other class, which the Examiner is interpreting the under-represented class to encompass a list of target entity classes of underrepresented entities; and see col. 18, ll. 56-67, col. 19, ll. 1-3: a classification score can be compared to a threshold, and if above the threshold, it is flagged for a particular classification, which the Examiner is interpreting to encompass each factor being set to a value greater than a threshold value), and a second set of loss adjustment factors for a list of target entity classes of overrepresented entities are each set to a value equal to or less than the threshold value (See col. 16, ll. 53-67, col. 17, ll. 1-57: which the Examiner is interpreting the over-represented class to encompass a list of target entity classes of overrepresented entities; and see col. 18, ll. 56-67, col. 19, ll. 1-3: comparison of the classification score to the threshold, which the Examiner is interpreting to encompass each factor being set to a value equal to or less than the threshold value);
detecting attributes in a medical report using the library of entity recognition models (See col. 14, ll. 10-27: The convolution layer’s parameters consist of a set of learnable filters that can be trained to detect certain features in an image, which the Examiner is interpreting features to encompass attributes.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the method of Lucas to include training a library of entity recognition models, wherein the library of entity recognition models comprises a plurality of machine learning models each trained using a common training dataset and a distinct set of training parameters, wherein the training dataset comprises a plurality of medical reports in various medical report categories, and associated lists of ground truth entity annotation vectors corresponding to the plurality of medical reports, wherein each set of training parameters is determined based on one or more attributes of one or more medical report categories and wherein each set of training parameters comprises a distinct set of loss adjustment factors for a list of target entity classes to account for class imbalances, wherein the distinct set of loss adjustment factors are applied during training to the common training dataset without reducing a quantity of training samples from any entity class, and wherein a first set of loss adjustment factors for a list of target entity classes of underrepresented entities are each set to a value greater than a threshold value, and a second set of loss adjustment factors for a list of target entity classes of overrepresented entities are each set to a value equal to or less than the threshold value; and detecting attributes in a medical report using the library of entity recognition models, as taught by Nguyen. One of ordinary skill in the art before the effective filing date of the claimed invention would have been motivated to modify Lucas with Nguyen with the motivation of improving the network performance applied to medical images by learning bias (See Detailed Description of the Invention of Nguyen, col. 15, ll. 6-9).
Claim(s) 17 mirrors claim 1 only within a different statutory category, and is rejected for the same reason as claim 1. The additional elements added in claim 17 are encompassed by Lucas. Lucas discloses an electronic medical records database (See Paragraph [0096]: A database of multiple documents, or another form of patient record, the request may pass through a pre-processing subroutine, a parsing subroutine, a dictionary lookup subroutine, a normalization subroutine, a structuring subroutine for filtering and/or ranking, and a post-processing subroutine in order to generate and serve a response to a remainder of the system.); and a patient summary system communicatively coupled to the electronic medical records database (See Paragraph [0247]: Document Pipeline 805, or corrected documents via the Workbench 810 (introduced below), uploads documents, and posts them to a server that coordinates a number of tasks and manages the intake of documents for the intake pipeline described in FIG. 1), the patient summary system comprising: instructions stored in non-transitory memory of the patient summary system (See Paragraph [0264]: The example computer system 900 includes a processing device 902, a main memory 904 (such as read-only memory (ROM), flash memory, dynamic random access memory (DRAM) such as synchronous DRAM (SDRAM) or DRAM, etc.), a static memory 906 (such as flash memory, static random access memory (SRAM), etc.), and a data storage device 918, which communicate with each other via a bus 930.); and a processor, that when executing the instructions causes the patient summary system to: (See Paragraph [0265]: Processing device 902 represents one or more general-purpose processing devices such as a microprocessor, a central processing unit, or the like.)
As per claim 2, Lucas/Nguyen discloses the method of claim 1 as described above. Lucas further teaches classifying the medical report into a category of a plurality of pre-determined categories comprises encoding the medical report as a feature vector, using one or more of a text and a metadata of the medical report and assigning the medical report to a category of the plurality of pre-determined categories based on the feature vector (See Paragraphs [0158]-[0160], [0206]: The whole document classifier may rely on a training model that has been trained on thousands of medical documents found in EMRs and EHRs of patients, each sentence in a document may be processed to assign a document vector, and each document in a patient’s EMR or EHR may be processed to assign a patient vector, which the Examiner is interpreting the document vector to encompass encoding the medical report as a feature vector, using one or more of a text and a metadata of the medical report, and pre-defined entries ([0206]) can include categories, which the Examiner is interpreting to encompass a category of the plurality of pre-determined categories based on the feature vector.)
As per claim 3, Lucas/Nguyen discloses the method of claim 1 as described above. Lucas further teaches wherein the plurality of machine learning models includes at least one trained entity recognition model for each of the plurality of pre-determined categories (See Paragraphs [0101]-[0102], [0118]-[0119], [0157]-[0158]: Language models may vary based upon the type of document being processed, (e.g., pathology reports, progress notes, and other EHR and EMR documents, etc.), to optimize the type of information which may be extracted from the documents, which the Examiner is interpreting the language models to encompass entity recognition models, and the type of document to encompass a pre-determined category), and wherein the distinct set of training parameters associated with each of the plurality of machine learning models is determined based on a category of the plurality of pre-determined categories (See Paragraphs [0062]-[0063]: Training may include providing optimized datasets, labeling these traits as they occur in patient records, and training the MLA to predict or classify based on new inputs, artificial NNs are efficient computing models which have shown their strengths in solving hard problems in artificial intelligence, they have also been shown to be universal approximators (can represent a wide variety of functions when given appropriate parameters), which the Examiner is interpreting optimized datasets to encompass the distinct set of training parameters, and identify features of importance to encompass a category of the plurality of pre-determined categories.)
As per claim 4, Lucas/Nguyen discloses the method of claim 1 as described above. Lucas further teaches wherein the distinct set of training parameters for the entity recognition model comprises a set of loss adjustment factors for a list of target entity classes (See Paragraphs [0103]-[0106]: Individual words may be provided a weighting factor for probability of occurrence across a massive training set, which the Examiner is interpreting a weighting factor for probability of occurrence to encompass a set of loss adjustment factors for a list of target entity classes), wherein the set of loss adjustment factors and the list of target entity classes for the entity recognition model are determined based on the category of the medical report (See Paragraphs [0103]-[0106]: A tabular extraction method may be performed across EMR and EHR documents, a resulting ruleset or neural network may recognize features across a standardized report signifying that a classification may be extracted from a specific section of a particular report.)
As per claim 5, Lucas/Nguyen discloses the method of claim 1 as described above. Lucas further teaches wherein the threshold value is one (See Paragraphs [0103]-[0106]: The training data may weigh the occurrence of words in medical texts, the scores may be below 1 to identify a lower probability, and a higher probability would have a higher score to be more relevant, which the Examiner is interpreting to encompass the threshold value is one when combined with Nguyen’s disclosure of threshold in col. 18, ll. 56-60.)
As per claim 7, Lucas/Nguyen discloses the method of claim 1 as described above. Lucas further teaches wherein matching the medical report with the entity recognition model from the library of entity recognition models comprises: extracting metadata from the medical report (See Paragraphs [0024]-[0026], [0054]-[0055]: The system may use a combination of text extraction techniques, text cleaning techniques, natural language processing techniques, machine learning algorithms, and medical concept (Entity) identification, normalization, and structuring techniques); classifying the medical report into the category of the plurality of pre-determined categories (See Paragraphs [0039]-[0042], [0206]: A combination of NLP and supervised, semi-supervised, or unsupervised MLA techniques may be used to generate an intelligent training set of data to recognize entries from the enumerated list of clinical drugs, in order to identify patterns within the text of abstracted documents which typically surround drug entries, which the Examiner is interpreting the pre-defined entries ([0206]) to encompass a category of a plurality of pre-determined categories); and mapping the category to the entity recognition model in the library of entity recognition models (See Paragraphs [0101]-[0102], [0118]-[0119], [0157]-[0158]: Language models may vary based upon the type of document being processed, (e.g., pathology reports, progress notes, and other EHR and EMR documents, etc.), to optimize the type of information which may be extracted from the documents, which the Examiner is interpreting the language models to encompass entity recognition models, and identifying the type of document to encompass mapping the category to an entity recognition model.)
As per claim 8, Lucas/Nguyen discloses the method of claim 1 as described above. Lucas further teaches wherein matching the medical report with the entity recognition model from the library of entity recognition models comprises: encoding the medical report as a feature vector (See Paragraphs [0158]-[0160]: The rule sets may include a vector of, for example, three hundred words and their respective weights, and each rule set may be applied over all words in a sentence to generate weights for every sentence); classifying the medical report into the category of the plurality of pre-determined categories based on a proximity of the feature vector to one or more category clusters in a feature vector space (See Paragraphs [0169]-[0171]: By considering proximity to other concept candidates, key information may be retained even if the concept may not exist in the database, which the Examiner is interpreting concept candidates to encompass category clusters in feature vector space); and mapping the category to the entity recognition model in the library of entity recognition models (See Paragraphs [0101]-[0102], [0118]-[0119], [0157]-[0158]: Language models may vary based upon the type of document being processed, (e.g., pathology reports, progress notes, and other EHR and EMR documents, etc.), to optimize the type of information which may be extracted from the documents, which the Examiner is interpreting the language models to encompass entity recognition models, and identifying the type of document to encompass mapping the category to an entity recognition model.)
As per claim 18, Lucas discloses the system of claim 17 as described above. Lucas further teaches the system further comprising a care provider device communicatively coupled to the patient summary system (See Paragraph [0266]: The computer system may further include a network interface device for connecting to the LAN, intranet, internet, and/or the extranet, the computer system also may include a video display unit (such as a liquid crystal display (LCD) or a cathode ray tube (CRT)), an alphanumeric input device (such as a keyboard), a cursor control device (such as a mouse), a signal generation device (such as a speaker), and a graphic processing unit 924 (such as a graphics card)), and wherein the processor is configured to display the summary of the medical report via the display device by: transmitting the summary of the medical report to the care provider device, wherein the care provider device includes the display (See Paragraphs [0238]-[0239], [0252]: The Workbench may represent a server for maintaining a user interface (UI) to implement a patient record analysis system responsible for managing the flow of information between systems of the instant architecture and/or stage of the processing pipeline); and displaying the summary of the medical report via the display device of the care provider device (See Paragraphs [0238]-[0239], [0252]: The Workbench may represent a server for maintaining a user interface (UI) to implement a patient record analysis system responsible for managing the flow of information between systems of the instant architecture and/or stage of the processing pipeline.)
Claims 9-15 are rejected under 35 U.S.C. 103 as being unpatentable over Lucas et al. (U.S. Patent Pre-Grant Publication No. 2020/0176098) in view of Nguyen et al. (U.S. Patent No. 10,496,884), and further in view of Vianu et al. (U.S. Patent Pre-Grant Publication No. 2020/0334809).
As per independent claim 9, Lucas discloses a method comprising:
selecting a category from a plurality of pre-determined medical report categories (See Paragraphs [0039]-[0042], [0206]: A combination of NLP and supervised, semi-supervised, or unsupervised MLA techniques may be used to generate an intelligent training set of data to recognize entries from the enumerated list of clinical drugs, in order to identify patterns within the text of abstracted documents which typically surround drug entries, which the Examiner is interpreting the pre-defined entries ([0206]) to encompass a category of a plurality of pre-determined categories);
determining a plurality of training parameters based on the category (See Paragraphs [0062]-[0063]: Training may include providing optimized datasets, labeling these traits as they occur in patient records, and training the MLA to predict or classify based on new inputs, artificial NNs are efficient computing models which have shown their strengths in solving hard problems in artificial intelligence, they have also been shown to be universal approximators (can represent a wide variety of functions when given appropriate parameters), which the Examiner is interpreting the optimized datasets to encompass a plurality of training parameters ([0061])), wherein the plurality of training parameters comprises distinct sets of loss adjustment factors for a list of target entity classes to account for class imbalances, and wherein the distinct sets of loss adjustment factors are applied during training to a complete training dataset without reducing a quantity of training samples from any entity class;
mapping the medical report to a list of entity classifications using an entity recognition model (See Paragraphs [0101]-[0102], [0118]-[0119], [0157]-[0158]: Language models may vary based upon the type of document being processed, (e.g., pathology reports, progress notes, and other EHR and EMR documents, etc.), to optimize the type of information which may be extracted from the documents, which the Examiner is interpreting the language models to encompass entity recognition models, and identifying the type of document to encompass mapping the medical report with an entity recognition model), wherein mapping the medical report to a list of entity classifications using the entity recognition model comprises:
tokenizing the medical report to produce a plurality of tokens (See Paragraphs [0157]-[0160]: Each respective sentence may be assigned a sentence vector (e.g., 10% female, 90% male), then each sentence in a document may be processed to assign a document vector, and finally, each document in a patient's EMR or EHR may be processed to assign a patient vector, which the Examiner is interpreting the vector to encompass a token);
encoding the plurality of tokens into an embedding space using a deep neural network, to produce a plurality of embedding vectors (See Paragraphs [0119]-[0121]: For sentences which are noisy (e.g., structured, but with unclear boundaries), a maximum entropy approach may be utilized, and in texts which are very specialized in nature (e.g., medical texts, legal texts, etc.), a tokenization and document segmentation algorithm may be applied, which the Examiner is interpreting the application of a tokenization and document segmentation algorithm to encompass encoding the plurality of tokens into an embedding space using a deep neural network, to produce a plurality of embedding vectors (Paragraphs [0156], [0163])); and
mapping each of the plurality of embedding vectors to a corresponding entity classification to produce the list of entity classifications (See Paragraphs [0094]-[0096], [0201]-[0202]: An Ontological graphing algorithm may be incorporated into the normalization, in lieu of an ontological graphing algorithm, normalization may include a second dictionary lookup algorithm with hardcoded mappings to relevant concepts, hardcoded mappings and/or alternatives may be identified in real time during processing by comparing mappings and alternatives to a selection criteria, such as those mappings and alternatives which are located in pre-approved databases, which the Examiner is interpreting mappings and alternatives which are located in pre-approved databases to encompass producing the list of entity classifications), wherein each entity classification comprises a vector of entity classification scores for a plurality of entity classes (See Paragraphs [0160]-[0162]: When a level of certainty lies below a threshold value (e.g., 90%), the whole document classifier may output the highest level vector calculated identifying, for example, a 60% confidence male and 40% confidence female, which the Examiner is interpreting the whole document classifier may output the highest level vector to encompass a vector of entity classification scores for a plurality of entity classes);
determining a base loss for each entity classification in the list of entity classifications by comparing each entity classification with a corresponding ground truth entity annotation from the list of ground truth entity annotations (See Paragraphs [0154]-[0156]: Concept candidates may be determined by noting important phrase types (e.g., NP, CD, etc.) and may be further refined by comparing any associated text against a list of weighted words, whereby words which are weighted above a threshold weight may be presented as concept candidates, and an MLA may be utilized to identify concept candidates, an exemplary MLA for identifying concept candidates includes a named entity recognition (NER) model, which the Examiner is interpreting comparing any associated text against a list of weighted words to encompass comparing each entity classification with a corresponding ground truth entity annotation from the list of ground truth entity annotations when combined with Vianu’s disclosure of ground truths and training data pairs) using a loss function, the loss function comprising a sum of squared errors function or a categorical cross-entropy loss function, wherein the base loss for each entity classification comprises a base loss vector with a plurality of loss values corresponding to the plurality of entity classes;
adjusting the base loss for each entity classification based on the plurality of training parameters to produce a list of adjusted losses (See Paragraphs [0103]-[0106]: Individual words may be provided a weighting factor for probability of occurrence across a massive training set, which the Examiner is interpreting a weighting factor for probability of occurrence to encompass producing a list of adjusted losses), wherein adjusting the base loss comprises scaling each loss in the base loss vector by a corresponding loss adjustment factor from a loss adjustment factor vector to produce an adjusted loss vector, and wherein the loss adjustment factor vector comprises a plurality of loss adjustment factors each corresponding to one of the plurality of entity classes, and summing the adjusted losses in the adjusted loss vector to produce a single adjusted loss;
updating parameters of the entity recognition model based on the list of adjusted losses (See Paragraphs [0248]-[0250], [0254]: The system also may check the database to determine whether improved NLP models have been provided and retrieve any new or updated models), wherein updating parameters of the entity recognition model based on the list of adjusted losses comprises determining an approximation of a gradient of the loss function for each of the plurality of training parameters and updating the parameters of the entity recognition model based upon the approximation of the gradient; and
storing the entity recognition model in an entity recognition model library (See Paragraphs [0248]-[0250], [0254], [0260]-[0262]: The system also may check the database to determine whether improved NLP models have been provided and retrieve any new or updated models, and the Model Training module retrieves existing NLP models from File Storage (some models can be trained multiple times and improve over many iterations of new data), which the Examiner is interpreting the Model Training module's retrieval of existing NLP models from File Storage to encompass storing the entity recognition model in an entity recognition model library, as the File Storage possesses NLP models.)
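For illustration only (hypothetical code, not part of the record and not representing the implementation of Lucas or of the claims), the recited sequence of tokenizing a medical report, encoding the tokens into an embedding space with a deep neural network, and mapping each embedding vector to a vector of entity classification scores can be sketched as follows; the class names, embedding dimension, and hash-based lookup are illustrative stand-ins for trained network components:

```python
import numpy as np

rng = np.random.default_rng(0)

ENTITY_CLASSES = ["O", "DRUG", "DOSAGE", "FINDING"]  # hypothetical class set
EMBED_DIM = 8

# Stand-ins for a trained deep neural network's parameters.
embedding_table = rng.normal(size=(1000, EMBED_DIM))
classifier_weights = rng.normal(size=(EMBED_DIM, len(ENTITY_CLASSES)))

def tokenize(report: str) -> list[str]:
    # Toy whitespace tokenizer; real systems typically use subword tokenization.
    return report.lower().split()

def encode(tokens: list[str]) -> np.ndarray:
    # Hash each token into the embedding table to produce one embedding vector per token.
    ids = [hash(t) % embedding_table.shape[0] for t in tokens]
    return embedding_table[ids]

def classify(embeddings: np.ndarray) -> np.ndarray:
    # Map each embedding vector to a vector of entity classification scores (softmax rows).
    logits = embeddings @ classifier_weights
    exp = np.exp(logits - logits.max(axis=1, keepdims=True))
    return exp / exp.sum(axis=1, keepdims=True)

report = "patient prescribed metformin 500 mg daily"
scores = classify(encode(tokenize(report)))
# One score vector per token, one score per entity class.
assert scores.shape == (6, len(ENTITY_CLASSES))
```

Each row of `scores` is a vector of entity classification scores for the plurality of entity classes, matching the structure recited in the mapping limitation.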
While Lucas discloses determining a plurality of training parameters based on the category;
updating parameters of the entity recognition model based on the list of adjusted losses,
Lucas may not explicitly teach wherein the plurality of training parameters comprises distinct sets of loss adjustment factors for a list of target entity classes to account for class imbalances, and wherein the distinct sets of loss adjustment factors are applied during training to a complete training dataset without reducing a quantity of training samples from any entity class;
and wherein the loss adjustment factor vector comprises a plurality of loss adjustment factors each corresponding to one of the plurality of entity classes,
updating parameters of the entity recognition model based on the list of adjusted losses, wherein updating parameters of the entity recognition model based on the list of adjusted losses comprises determining an approximation of a gradient of the loss function for each of the plurality of training parameters and updating the parameters of the entity recognition model based upon the approximation of the gradient.
Nguyen teaches a method wherein the plurality of training parameters comprises distinct sets of loss adjustment factors for a list of target entity classes to account for class imbalances (See col. 16, ll. 53-67, col. 17, ll. 1-20, col. 25, ll. 26-59: col. 16, ll. 53-67 and col. 17, ll. 1-20 discuss class balance, which the Examiner is interpreting to encompass a list of target entity classes to account for class imbalances as the flattening block can flatten the input into a vector, and the study score and associated loss can then be calculated, based on a study label), and wherein the distinct sets of loss adjustment factors are applied during training to a complete training dataset without reducing a quantity of training samples from any entity class (See col. 15, ll. 45-67, col. 16, ll. 1-12: Hinge loss calculations can be used during training and testing of the network, which the Examiner is interpreting to encompass the claimed portion);
adjusting the base loss for each entity classification based on the plurality of training parameters to produce a list of adjusted losses, wherein adjusting the base loss comprises scaling each loss in the base loss vector by a corresponding loss adjustment factor from a loss adjustment factor vector to produce an adjusted loss vector, and wherein the loss adjustment factor vector comprises a plurality of loss adjustment factors each corresponding to one of the plurality of entity classes (See col. 15, ll. 45-67, col. 16, ll. 1-11: The hinge loss functions are used to adjust the network parameters, to optimize its performance and accuracy, which the Examiner is interpreting the hinge loss functions to encompass a plurality of loss adjustment factors each corresponding to one of the plurality of entity classes), and summing the adjusted losses in the adjusted loss vector to produce a single adjusted loss;
updating parameters of the entity recognition model based on the list of adjusted losses, wherein updating parameters of the entity recognition model based on the list of adjusted losses comprises determining an approximation of a gradient of the loss function for each of the plurality of training parameters and updating the parameters of the entity recognition model based upon the approximation of the gradient (See col. 5, ll. 16-35, col. 19, ll. 36-52: The Examiner is interpreting the gradient descent to encompass an approximation of a gradient of the loss function for each of the plurality of training parameters and the updating of the network to encompass updating the parameters of the entity recognition model based upon the approximation of the gradient.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the method of Lucas to include the plurality of training parameters comprises distinct sets of loss adjustment factors for a list of target entity classes to account for class imbalances, and wherein the distinct sets of loss adjustment factors are applied during training to a complete training dataset without reducing a quantity of training samples from any entity class; the loss adjustment factor vector comprises a plurality of loss adjustment factors each corresponding to one of the plurality of entity classes, updating parameters of the entity recognition model based on the list of adjusted losses, wherein updating parameters of the entity recognition model based on the list of adjusted losses comprises determining an approximation of a gradient of the loss function for each of the plurality of training parameters and updating the parameters of the entity recognition model based upon the approximation of the gradient as taught by Nguyen. One of ordinary skill in the art before the effective filing date of the claimed invention would have been motivated to modify Lucas with Nguyen with the motivation of improving the network performance applied to medical images by learning bias (See Detailed Description of the Invention of Nguyen in col. 15, ll. 6-9).
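For illustration only (a hypothetical numeric sketch, not drawn from any cited reference), the combined limitation of a base loss vector per entity class, scaling by a loss adjustment factor vector, summing to a single adjusted loss, and updating parameters via an approximated gradient can be sketched as follows; the factor values, sum-of-squared-errors choice, and finite-difference gradient approximation are illustrative assumptions:

```python
import numpy as np

# Hypothetical per-class loss adjustment factors: values greater than 1
# up-weight under-represented target entity classes.
loss_adjustment = np.array([0.5, 2.0, 2.0, 1.0])

def base_loss_vector(pred: np.ndarray, truth_onehot: np.ndarray) -> np.ndarray:
    # Sum-of-squared-errors per class: one loss value per entity class.
    return (pred - truth_onehot) ** 2

def adjusted_loss(pred: np.ndarray, truth_onehot: np.ndarray) -> float:
    # Scale each loss in the base loss vector by its adjustment factor,
    # then sum the adjusted losses into a single adjusted loss.
    return float(np.sum(base_loss_vector(pred, truth_onehot) * loss_adjustment))

def sgd_step(params: np.ndarray, loss_fn, lr=0.1, eps=1e-5) -> np.ndarray:
    # Central finite-difference approximation of the loss gradient per parameter,
    # followed by a gradient-descent parameter update.
    grad = np.zeros_like(params)
    for i in range(params.size):
        bump = np.zeros_like(params)
        bump.flat[i] = eps
        grad.flat[i] = (loss_fn(params + bump) - loss_fn(params - bump)) / (2 * eps)
    return params - lr * grad

truth = np.array([0.0, 1.0, 0.0, 0.0])
params = np.array([0.3, 0.2, 0.4, 0.1])  # stand-in for model outputs/parameters
loss_of = lambda p: adjusted_loss(p, truth)
before = loss_of(params)
params = sgd_step(params, loss_of)
assert loss_of(params) < before  # the update reduces the single adjusted loss
```

The sketch shows only the claimed data flow (base loss vector, elementwise scaling, summation, approximated-gradient update), not any party's actual training code.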
While Lucas discloses determining a base loss for each entity classification in the list of entity classifications by comparing each entity classification with a corresponding ground truth entity annotation from the list of ground truth entity annotations;
adjusting the base loss for each entity classification based on the plurality of training parameters to produce a list of adjusted losses, and wherein the loss adjustment factor vector comprises a plurality of loss adjustment factors each corresponding to one of the plurality of entity classes; and
Lucas/Nguyen may not explicitly teach selecting a training data pair, wherein the training data pair comprises a medical report and a list of ground truth entity annotations;
determining a base loss for each entity classification in the list of entity classifications by comparing each entity classification with a corresponding ground truth entity annotation from the list of ground truth entity annotations using a loss function, the loss function comprising a sum of squared errors function or a categorical cross-entropy loss function, wherein the base loss for each entity classification comprises a base loss vector with a plurality of loss values corresponding to the plurality of entity classes;
adjusting the base loss for each entity classification based on the plurality of training parameters to produce a list of adjusted losses, wherein adjusting the base loss comprises scaling each loss in the base loss vector by a corresponding loss adjustment factor from a loss adjustment factor vector to produce an adjusted loss vector, and summing the adjusted losses in the adjusted loss vector to produce a single adjusted loss.
Vianu teaches a method for selecting a training data pair, wherein the training data pair comprises a medical report and a list of ground truth entity annotations (See Paragraphs [0147]-[0148], [0171]-[0172]: The identified pairs or groups of anatomical sections identified from the radiological images and reports, and any given segment or input report text has an associated ground truth, which in this case can be thought of as the diagnosis as the reviewing physician/radiologist intended to read the radiological images, which the Examiner is interpreting the radiological images and reports to encompass a medical report and the associated ground truth to encompass a list of ground truth entity annotations);
determining a base loss for each entity classification in the list of entity classifications by comparing each entity classification with a corresponding ground truth entity annotation from the list of ground truth entity annotations using a loss function, the loss function comprising a sum of squared errors function or a categorical cross-entropy loss function (See Paragraphs [0032], [0161]-[0163]: The total loss functions for the given training data pair further includes the Siamese loss ([0032]), and the regularizers contribute to the overall loss function that is used to train ML network, the training of ML network is driven by categorical cross entropy loss ([0161])), wherein the base loss for each entity classification comprises a base loss vector with a plurality of loss values corresponding to the plurality of entity classes (See Paragraphs [0032], [0161]-[0166]: By defining an additional loss component that is incorporated into the overall loss function used to train ML network, each of the two regularizer networks specifically targets and refines the manner in which first encoder learns or generates word embeddings for the sections of report text, and in embodiments where the embeddings take the form of real-valued vectors within a pre-defined vector space, semantic information of the input report text 422 is in theory captured by the expectation that embeddings for semantically or syntactically related words will be closer to each other in the vector space than to unrelated words in the vector space, which the Examiner is interpreting the form of real-valued vectors to encompass a base loss vector with a plurality of loss values corresponding to the plurality of entity classes);
adjusting the base loss for each entity classification based on the plurality of training parameters to produce a list of adjusted losses, wherein adjusting the base loss comprises scaling each loss in the base loss vector by a corresponding loss adjustment factor from a loss adjustment factor vector to produce an adjusted loss vector (See Paragraph [0061]: The measures or other outputs of uncertainty from one or more components of the presently disclosed machine learning network(s) can be expressed as a feature vector, which can then be used as an input feature for the disclosed Bayesian approach to estimating physician's accuracies in diagnosing a pathology, which the Examiner is interpreting a feature vector to encompass a corresponding loss adjustment factor vector when combined with Nguyen’s disclosure of loss function in col. 24, ll. 1-5), and wherein the loss adjustment factor vector comprises a plurality of loss adjustment factors each corresponding to one of the plurality of entity classes, and summing the adjusted losses in the adjusted loss vector to produce a single adjusted loss (See Paragraphs [0061], [0192]-[0195]: The five different losses are aggregated into a final total loss function to train the overall ML network, which the Examiner is interpreting the five different losses are aggregated into a final total loss function to encompass summing the adjusted losses in the adjusted loss vector to produce a single adjusted loss.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the method of Lucas/Nguyen to include selecting a training data pair, wherein the training data pair comprises a medical report and a list of ground truth entity annotations; determining a base loss for each entity classification in the list of entity classifications by comparing each entity classification with a corresponding ground truth entity annotation from the list of ground truth entity annotations using a loss function, the loss function comprising a sum of squared errors function or a categorical cross-entropy loss function; adjusting the base loss for each entity classification based on the plurality of training parameters to produce a list of adjusted losses, wherein adjusting the base loss comprises scaling each loss in the base loss vector by a corresponding loss adjustment factor from a loss adjustment factor vector to produce an adjusted loss vector, and summing the adjusted losses in the adjusted loss vector to produce a single adjusted loss as taught by Vianu. One of ordinary skill in the art before the effective filing date of the claimed invention would have been motivated to modify Lucas/Nguyen with Vianu with the motivation of improving the assessment of physicians' accuracies in delivering their diagnoses (See Detailed Description of Vianu in Paragraph [0208]).
As per claim 10, Lucas/Nguyen/Vianu discloses the method of claim 9 as described above. Lucas further teaches wherein adjusting the base loss for each entity classification based on the plurality of training parameters, comprises: determining if the corresponding ground truth entity annotation matches a target class from the list of target entity classes (See Paragraphs [0103]-[0106], [0206]-[0207], [0239]: Individual words may be provided a weighting factor for probability of occurrence across a massive training set, which the Examiner is interpreting a weighting factor for probability of occurrence to encompass a set of loss adjustment factors for a list of target entity classes, and text matching can occur which the Examiner is interpreting to encompass determining if the corresponding ground truth entity annotation matches a target class); and responding to the corresponding ground truth entity annotation matching the target class by: selecting a loss adjustment factor from the list of loss adjustment factors based on the target class (See Paragraphs [0103]-[0106]: Individual words may be provided a weighting factor for probability of occurrence across a massive training set, which the Examiner is interpreting a weighting factor for probability of occurrence to encompass a loss adjustment factor from the list of loss adjustment factors); and scaling the base loss by the loss adjustment factor to produce an adjusted loss (See Paragraphs [0206]-[0208]: The Entity structuring pipeline compiles each of the normalized concepts identified in the previous stage, as normalization may be directed to generate results which relate to the fields of the structure category identified, which the Examiner is interpreting the normalization to encompass scaling.)
As per claim 11, Lucas/Nguyen/Vianu discloses the method of claim 9 as described above. Lucas further teaches wherein each entity classification comprises a vector of entity classification scores for each of a plurality of entity classes (See Paragraphs [0158]-[0160]: The rule sets may include a vector of, for example, three hundred words and their respective weights, and each rule set may be applied over all words in a sentence to generate weights for every sentence), and wherein adjusting the base loss for each entity classification based on the plurality of training parameters, comprises: determining if an entity classification score from the vector of entity classification scores matches a target class from the list of target entity classes (See Paragraphs [0039]-[0042], [0103]-[0106], [0158]-[0160], [0206]: A combination of NLP and supervised, semi-supervised, or unsupervised MLA techniques may be used to generate an intelligent training set of data to recognize entries from the enumerated list of clinical drugs, in order to identify patterns within the text of abstracted documents which typically surround drug entries and individual words may be provided a weighting factor for probability of occurrence across a massive training set, which the Examiner is interpreting a weighting factor for probability of occurrence to encompass a set of loss adjustment factors for a list of target entity classes); and responding to the entity classification score matching the target class by: selecting a loss adjustment factor from the list of loss adjustment factors based on the target class (See Paragraphs [0103]-[0106]: Individual words may be provided a weighting factor for probability of occurrence across a massive training set, which the Examiner is interpreting a weighting factor for probability of occurrence to encompass a loss adjustment factor from the list of loss adjustment factors); and scaling the base loss for the entity classification score by the loss adjustment factor to produce an adjusted loss (See Paragraphs [0206]-[0208]: The Entity structuring pipeline compiles each of the normalized concepts identified in the previous stage, as normalization may be directed to generate results which relate to the fields of the structure category identified, which the Examiner is interpreting the normalization to encompass scaling.)
As per claim 12, Lucas/Nguyen/Vianu discloses the method of claim 9 as described above. Lucas further teaches wherein selecting the training data pair comprises selecting the training data pair in a randomized order from a training data set (See Paragraph [0063]: The neural networks may be trained from a training data set, the training may include providing optimized datasets, labeling these traits as they occur in patient records, and training the MLA to predict or classify based on new inputs, which the Examiner is interpreting the provided optimized dataset to encompass a training data set and that the selection can occur in any order.)
As per claim 13, Lucas/Nguyen/Vianu discloses the method of claim 9 as described above. Lucas further teaches wherein storing the entity recognition model in the entity recognition model library includes indexing the entity recognition model according to the category from the plurality of pre-determined medical report categories (See Paragraphs [0167], [0249]: The highest ranked competing concept candidate may be preserved along with a reliability index, or a consolidated report of the most frequent competing concept candidates may be preserved along with their count values and/or reliability index, and the system also may check the database to determine whether improved NLP models have been provided and retrieve any new or updated models, which the Examiner is interpreting the reliability index to encompass indexing the entity recognition model according to the category.)
As per claim 14, Lucas/Nguyen/Vianu discloses the method of claim 9 as described above. Lucas further teaches wherein the plurality of training parameters further comprises a loss adjustment factor vector (See Paragraphs [0103]-[0106], [0159]-[0160]: Individual words may be provided a weighting factor for probability of occurrence across a massive training set, which the Examiner is interpreting a weighting factor for probability of occurrence to encompass a loss adjustment factor vector when utilized with the vectors included in the rule sets), and wherein adjusting the base loss for each entity classification based on the plurality of training parameters, comprises: selecting a loss adjustment factor from the loss adjustment factor vector based on the corresponding ground truth entity annotation (See Paragraphs [0103]-[0106]: Individual words may be provided a weighting factor for probability of occurrence across a massive training set, which the Examiner is interpreting a weighting factor for probability of occurrence to encompass a loss adjustment factor from the list of loss adjustment factors when combined with Vianu); and scaling the base loss by the loss adjustment factor to produce an adjusted loss (See Paragraphs [0206]-[0208]: The Entity structuring pipeline compiles each of the normalized concepts identified in the previous stage, as normalization may be directed to generate results which relate to the fields of the structure category identified, which the Examiner is interpreting the normalization to encompass scaling.)
As per claim 15, Lucas/Nguyen/Vianu discloses the method of claim 9 as described above. Lucas further teaches wherein each entity classification comprises a vector of entity classification scores for each of a plurality of entity classes (See Paragraphs [0158]-[0160]: The rule sets may include a vector of, for example, three hundred words and their respective weights, and each rule set may be applied over all words in a sentence to generate weights for every sentence), wherein the plurality of training parameters further comprises a loss adjustment factor vector (See Paragraphs [0103]-[0106], [0159]-[0160]: Individual words may be provided a weighting factor for probability of occurrence across a massive training set, which the Examiner is interpreting a weighting factor for probability of occurrence to encompass a loss adjustment factor vector when utilized with the vectors included in the rule sets), and wherein adjusting the base loss for each entity classification based on the plurality of training parameters, comprises: scaling the base loss for each entity classification score in the vector of entity classification scores by a corresponding loss adjustment factor from the loss adjustment factor vector, to produce an adjusted loss (See Paragraphs [0206]-[0208]: The Entity structuring pipeline compiles each of the normalized concepts identified in the previous stage, as normalization may be directed to generate results which relate to the fields of the structure category identified, which the Examiner is interpreting the normalization to encompass scaling.)
Claim 16 is rejected under 35 U.S.C. 103 as being unpatentable over Lucas et al. (U.S. Patent Pre-Grant Publication No. 2020/0176098) in view of Nguyen et al. (U.S. Patent No. 10,496,884) in view of Vianu et al. (U.S. Patent Pre-Grant Publication No. 2020/0334809) in further view of Torres et al. (U.S. Patent Pre-Grant Publication No. 2021/0209513).
As per claim 16, Lucas/Nguyen/Vianu discloses the method of claims 9 and 15 as described above. Lucas further teaches wherein the base loss is a base loss vector comprising a plurality of losses for the plurality of entity classes (See Paragraphs [0161]-[0163]: When a level of certainty lies below a threshold value (e.g., 90%), the whole document classifier may output the highest level vector calculated identifying, for example, a 60% confidence male and 40% confidence female, which the Examiner is interpreting the highest level vector to encompass a base loss vector comprising a plurality of losses for the plurality of entity classes.)
While Lucas/Nguyen/Vianu discloses the method as described above, Lucas/Nguyen/Vianu may not explicitly teach wherein scaling the base loss for each entity classification score in the vector of entity classification scores by the corresponding loss adjustment factor from the loss adjustment factor vector comprises taking a dot product of the base loss vector and the loss adjustment factor vector.
Torres teaches a method wherein scaling the base loss for each entity classification score in the vector of entity classification scores by the corresponding loss adjustment factor from the loss adjustment factor vector (See Paragraph [0043]: The combined loss function may combine errors from the objective function for each training task, by minimizing the combined loss function, the base model can be trained to maximize performance on two or more training tasks, which the Examiner is interpreting to encompass the claimed portion when combined with Lucas/Nguyen/Vianu) comprises taking a dot product of the base loss vector and the loss adjustment factor vector (See Paragraphs [0043]-[0045]: The task specific output layer may generate a prediction by computing the dot product between the final output matrix generated by the encoder with its weight matrix, adding its bias vector, and passing the output through an activation function to transform the vector values to word values.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the method of Lucas/Nguyen/Vianu to include scaling the base loss for each entity classification score in the vector of entity classification scores by the corresponding loss adjustment factor from the loss adjustment factor vector comprises taking a dot product of the base loss vector and the loss adjustment factor vector as taught by Torres. One of ordinary skill in the art before the effective filing date of the claimed invention would have been motivated to modify Lucas/Nguyen/Vianu with Torres with the motivation of improving overall data quality by eliminating common causes of human error including distraction or fatigue (See Detailed Description of Several Embodiments of Torres in Paragraph [0010]).
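For illustration only (hypothetical values, not drawn from the record), claim 16's limitation rests on the standard identity that elementwise scaling of the base loss vector by the loss adjustment factor vector followed by summation equals the dot product of the two vectors:

```python
import numpy as np

base_loss = np.array([0.10, 0.40, 0.05, 0.20])  # hypothetical per-class losses
factors = np.array([1.0, 3.0, 1.0, 0.5])        # hypothetical adjustment factors

# Elementwise scaling followed by summation is exactly the dot product.
single_adjusted_loss = float(np.dot(base_loss, factors))
assert np.isclose(single_adjusted_loss, np.sum(base_loss * factors))
```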
Response to Arguments
In the Remarks filed on January 2, 2026, the Applicant argues that the newly amended and/or added claims overcome the 35 U.S.C. 103 rejection(s). The Examiner does not agree that the newly added and/or amended claims overcome the 35 U.S.C. 103 rejection(s).
The Applicant argues that:
(1) none of the cited references, even if considered in combination, disclose or suggest "wherein the distinct set of loss adjustment factors are applied during training to the common training dataset without reducing a quantity of training samples from any entity class," as required by amended claim 1. While the Office cites Nguyen as disclosing "loss adjustment factors for a list of target entity classes to account for class imbalances" at pages 9-10 of the Office action, citing to col. 16, ll. 53-67, col. 17, ll. 1-20, and col. 25, ll. 26-59, Nguyen's approach fundamentally differs from the claimed invention in both methodology and technical implementation. Nguyen explicitly teaches addressing class imbalance through data reduction techniques, specifically "deflation methods." The cited portions of Nguyen describe identifying certain groups and then reducing the quantity of training samples from those groups to achieve balance. In contrast, amended claim 1 now explicitly recites that "the distinct set of loss adjustment factors are applied during training to the common training dataset without reducing a quantity of training samples from any entity class." This is a fundamentally different technical approach that uses all training samples from all entity classes throughout the entire training process and addresses class imbalances by setting specific loss adjustment factors. By setting loss adjustment factors greater than the threshold value for underrepresented classes and equal to or less than the threshold value for overrepresented classes, the model learns to prioritize accurate identification of underrepresented entities while still learning from all available data. This specific approach to handling class imbalances provides distinct technical advantages that are not taught by the cited references. 
For example, by preserving the complete training dataset rather than artificially restricting certain categories of data, the claimed method reduces the risk of overfitting that can occur when the model is trained on an artificially balanced but incomplete dataset. Additionally, the claimed method achieves more efficient knowledge extraction by learning from all available training examples while still addressing class imbalances through targeted loss adjustments. This allows the model to learn contextual relationships and patterns that would be lost if training samples were removed. Further, the claimed approach avoids the computational overhead and complexity of determining which samples to exclude or how to artificially balance the dataset through data manipulation. The remaining cited references fail to cure the deficiencies of Nguyen. Whether or not Lucas discusses general weighting factors, Lucas does not teach preserving all training samples while applying distinct loss adjustment factors to different entity classes to account for class imbalances. For at least the reasons presented above, Applicant respectfully requests that the rejections under 35 U.S.C. 103 of claim 1 and all claims depending therefrom be withdrawn. Applicant asserts that similar arguments and amendments as presented above with respect to claim 1 also apply to claim 17. As such, for at least the reasons provided above, Applicant respectfully requests that the rejection of claim 17 and all claims depending therefrom be withdrawn;
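For illustration only, and not part of the record: the distinction argued above, between rebalancing by loss scaling and rebalancing by removing training samples, can be sketched in a minimal weighted cross-entropy loss. All values and class indices below are hypothetical; no library beyond the Python standard library is assumed.

```python
import math

def weighted_ce_loss(logits, labels, class_weights):
    """Mean cross-entropy over all tokens, with each token's loss
    scaled by a per-class loss adjustment factor; every training
    sample from every class is retained (no undersampling)."""
    total = 0.0
    for row, label in zip(logits, labels):
        m = max(row)
        denom = sum(math.exp(z - m) for z in row)
        log_prob = (row[label] - m) - math.log(denom)
        total += class_weights[label] * (-log_prob)
    return total / len(labels)

# toy setup: class 1 is under-represented, so it receives a factor > 1
logits = [[2.0, 0.1], [1.5, 0.3], [0.2, 1.8]]
labels = [0, 0, 1]
balanced = weighted_ce_loss(logits, labels, [1.0, 1.0])
adjusted = weighted_ce_loss(logits, labels, [1.0, 3.0])
assert adjusted > balanced  # errors on the rare class now cost more
```

In this sketch the dataset is never shrunk; only the per-class factors change the gradient signal, which is the contrast the Applicant draws against deflation-style data reduction.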
(2) Applicant asserts that similar arguments and amendments as presented above with respect to claim 1 also apply to claim 9. As such, for at least the reasons provided above, Applicant respectfully requests that the rejections of claim 9 and all claims depending therefrom be withdrawn. Additionally, none of the cited references, even if considered in combination, disclose or suggest "mapping each of the plurality of embedding vectors to a corresponding entity classification to produce the list of entity classifications, wherein each entity classification comprises a vector of entity classification scores for a plurality of entity classes," as required by amended claim 9. The Office cites paragraphs [0157]-[0160] of Lucas at page 19 of the Office action, stating that "each respective sentence may be assigned a sentence vector" and interpreting "the vector to encompass a token." However, this interpretation is incorrect for several reasons. First, the Office has not established that sentence vectors are entity classification vectors. Lucas describes assigning vectors to sentences, documents, and patients for general document classification purposes, such as determining gender demographics. These are not entity classification vectors that provide scores for a plurality of named entity classes. Second, the Office has not established that tokens are entity classifications. The Office's interpretation that a "vector" encompasses a "token" conflates fundamentally different concepts without support in the evidence of record. The Office has not sufficiently established how a token, typically a discrete unit of text such as a word or subword, is equated to an entity classification vector. Third, Lucas does not teach generating, for each token or embedding vector, a corresponding vector of entity classification scores where each score represents the probability or confidence that the token belongs to a specific entity class from a plurality of entity classes as claimed.
The remaining cited references fail to cure the deficiencies of Lucas. Nguyen discusses image classification networks that output study-level classifications, as described in col. 23, ll. 44-67 and col. 24, ll. 1-22, not token-level entity classification vectors with class-specific scores for named entity recognition. Nguyen's networks classify entire medical images into categories, such as normal versus abnormal, which is fundamentally different from generating entity classification score vectors for individual tokens in text. Further, while Vianu discusses feature vectors in paragraph [0061], these are described in the context of uncertainty estimation and Bayesian approaches to assessing physician accuracy, not in the context of generating entity classification vectors with class-specific scores for named entity recognition in medical reports. Vianu does not teach mapping embedding vectors to entity classification vectors comprising scores for a plurality of entity classes as claimed.
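For illustration only, and not part of the record: the claim element at issue, mapping each token's embedding vector to a vector of entity classification scores, can be sketched as a linear projection followed by a softmax over entity classes. The class labels and all numeric values below are hypothetical, and only the Python standard library is assumed.

```python
import math
import random

ENTITY_CLASSES = ["O", "ANATOMY", "FINDING", "MEASUREMENT"]  # hypothetical labels

def classify_token(embedding, weights, bias):
    """Project one token's embedding vector to a softmax-normalized
    vector of entity classification scores, one score per entity class."""
    logits = [sum(e * w for e, w in zip(embedding, row)) + b
              for row, b in zip(weights, bias)]
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

random.seed(0)
dim = 8
embedding = [random.gauss(0, 1) for _ in range(dim)]       # one token's embedding
weights = [[random.gauss(0, 1) for _ in range(dim)]
           for _ in ENTITY_CLASSES]                        # one projection row per class
bias = [0.0] * len(ENTITY_CLASSES)
scores = classify_token(embedding, weights, bias)
assert len(scores) == len(ENTITY_CLASSES)
assert abs(sum(scores) - 1.0) < 1e-9  # a score for every entity class
```

The point of the sketch is that the output is per-token and per-class: each token yields a full vector of class scores, as distinguished in the argument above from a document-level or study-level classification.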
In response to argument (1), the Examiner does not find the Applicant’s argument(s) persuasive. The Examiner maintains that Nguyen teaches “wherein the distinct set of loss adjustment factors are applied during training to the common training dataset without reducing a quantity of training samples from any entity class,” as required by amended claim 1, at col. 16, ll. 53-67, col. 17, ll. 1-20, and col. 25, ll. 26-59. Specifically, col. 16, ll. 53-67 and col. 17, ll. 1-20 discuss class balance, which the Examiner interprets to encompass a list of target entity classes to account for class imbalances, as the flattening block can flatten the input into a vector, and the study score and associated loss can then be calculated based on a study label. Further, in col. 15, ll. 58-67 and col. 16, ll. 1-12, the hinge loss functions of Nguyen are used to adjust the network parameters to optimize performance and accuracy of the network, and the Examiner interprets the calculation of hinge loss to encompass a distinct set of loss adjustment factors. The network parameters of Nguyen are described as weights that are determined during a training phase so as to recognize patterns (col. 8, ll. 2-16). The Examiner interprets the adjustment of the network parameters by the hinge loss functions to encompass “wherein the distinct set of loss adjustment factors are applied during training to the common training dataset without reducing a quantity of training samples from any entity class,” as this adjustment occurs before the class balancing. The Examiner maintains that the combination of Lucas/Nguyen encompasses newly amended independent claim 1. The 35 U.S.C. 103 rejection(s) stand.
In response to argument (2), the Examiner does not find the Applicant’s argument(s) persuasive. The Examiner maintains that Lucas’ teachings in Paragraphs [0094]-[0096], [0160]-[0162], and [0201] teach “mapping each of the plurality of embedding vectors to a corresponding entity classification to produce the list of entity classifications, wherein each entity classification comprises a vector of entity classification scores for a plurality of entity classes,” as described above in the 35 U.S.C. 103 rejection(s). Lucas in Paragraphs [0160]-[0162] describes that, when a level of certainty lies below a threshold value (e.g., 90%), the whole document classifier may output the highest level vector calculated, identifying, for example, a 60% confidence male and 40% confidence female; the Examiner interprets this output of the highest level vector to encompass a vector of entity classification scores for a plurality of entity classes. The Examiner further maintains that Lucas’ teaching in Paragraph [0119] identifies that tokenization of the data may be available when utilizing the tokenization and document segmentation algorithm: “These deficiencies in sentence splitting may be overcome by adding models before this stage to identify whether text is semi-structured data, well-formed text, clinical shorthand, uninformative headers/footers, etc. By creating methods for distinguishing between these types of text, the intake pipeline may use specific models to extract information from each type. For example, complex sentences may be broken down into simple sentences by looking for coordination constructs, adjectival clauses, evaluating parataxis, prepositional phrases, etc., by applying phrase-based or syntax-based machine translation approaches. For sentences which are well-structured (e.g., following traditional grammar and prose), parse trees or deep semantic representations may be utilized. For sentences which are noisy (e.g., structured, but with unclear boundaries), a maximum entropy approach may be utilized.
In texts which are very specialized in nature (e.g., medical texts, legal texts, etc.), a tokenization and document segmentation algorithm may be applied. By implementing sentence splitting, the processing pipeline may split the document into sentences for individual parsing.” The Examiner maintains that the “confidence value” and “level of confidence” described in Paragraphs [0161]-[0162] and [0252] teach “generating, for each token or embedding vector, a corresponding vector of entity classification scores where each score represents the probability or confidence that the token belongs to a specific entity class from a plurality of entity classes”. The 35 U.S.C. 103 rejection(s) stand.
Conclusion
THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Bennett S Erickson whose telephone number is (571)270-3690. The examiner can normally be reached Monday - Friday: 9:00am - 5:00pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Robert Morgan, can be reached at (571) 272-6773. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/Bennett Stephen Erickson/ Primary Examiner, Art Unit 3683