Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Election/Restrictions
In response to the Office action mailed 08/19/2024, applicant elected claims 13-20. Claims 1-12 are withdrawn. Claims 13-20 are pending.
Claim Objections
Claims 17 and 19 are objected to because of the following informalities: the claims contain the acronyms “DAG”, “BERT”, “Seq2Seq”, and “LSTM”, but no definition is given. Using an acronym without its definition creates ambiguity in the claim language. For the purposes of this examination, the examiner will treat “DAG” as referring to “directed acyclic graph”, “BERT” as referring to “Bidirectional Encoder Representations from Transformers”, “Seq2Seq” as referring to “Sequence-to-Sequence”, and “LSTM” as referring to “Long Short-Term Memory”. Appropriate correction is required.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
Claims 16-20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Claim 16 recites “applying a conditional random field (CRF) to an output of the attention layer to infer a label sequence with a highest probability given the message context”. Claim 16 depends on claim 15, which recites “wherein the received message comprises a contextual message and a received message”. It is unclear to which message of claim 15 applicant is referring in claim 16.
Claims 17-20 are rejected by virtue of their dependency on rejected claim 16.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 13-16 are rejected under 35 U.S.C. 103 as being unpatentable over Du et al. (Journal of the American Medical Informatics Association, published June 24, 2019, pages 1279-1285) in view of Lu (US 11017173).
As per claim 13, Du et al. teaches a computerized method for implementing a neural architecture for hierarchical sequence labelling (Fig. 1) comprising:
providing a neural architecture comprising a set of labelling layers, wherein the neural architecture uses a multi-pass approach on the set of labelling layers (page 1280, col. 1, lines 43-45 and page 1281, Fig. 1, wherein the labeling network consists of several fully connected layers and an output layer for classification/labeling of the input text document),
receiving and parsing an input sentence (page 1280, col. 1, lines 45-49, a document encoding network that takes raw text as the input);
embedding the input sentence into a corresponding character vector and a corresponding word vector to generate a feature vector (page 1280, col. 1, lines 45-68, a document encoding network that takes raw text as the input and outputs the high dimensional vectors that represent the entire textual document);
passing the feature vector through the neural architecture (Fig. 1, wherein the feature vector is input into the neural network architecture); and
performing a multi-layer labelling procedure on the feature vector with the neural architecture (page 1281, Fig. 1, wherein multiple fully connected layers of the multi-layer perceptron (MLP) neural networks are used for label prediction) comprising:
augmenting a set of corresponding bits of the feature vector, wherein the feature vector is passed through the set of labelling layers of the neural architecture, wherein each subsequent layer of the neural architecture comprises a same neural architecture with a new set of labels and produces an augmented version of the feature vector, wherein the feature vector is initially empty at a first layer of the set of labelling layers, wherein at the end of each layer of the set of labelling layers additional information is added to the feature vector such that each subsequent layer has an additional context when a labelling action is performed during a subsequent layer (Fig. 1 and page 1280, col. 2, lines 19-37, wherein a feature vector corresponding to the received raw text document is input into a sequence of fully connected layers, and each subsequent layer adds its corresponding context to the labeling of the received vector; moreover, the first fully connected layer takes a document vector as the input and outputs a predicted confidence score for each label).
Du may not explicitly disclose embedding the input sentence into a corresponding character vector and a corresponding word vector to generate a feature vector. Lu, in the same field of endeavor, teaches receiving an input sentence (col. 11, lines 66-67, FIG. 7A, receiving a caption input by the user of the client device); parsing the input sentence (col. 12, lines 1-3, analyzing each word in the caption of the multimodal message); and embedding the input sentence into a corresponding character vector and a corresponding word vector to generate a feature vector (col. 12, lines 1-10, embedding the received message and generating word and character vectors). Therefore, it would have been obvious before the effective filing date of the claimed invention to use the above features of Lu with the system of Du, in order to reduce computational cost and improve accuracy in natural language processing.
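For clarity of the record only, the claimed generation of a feature vector from both word-level and character-level embeddings may be sketched as follows. This is an illustrative example, not code from Du or Lu; the dimensions and the seeding scheme are hypothetical choices for demonstration.

```python
import numpy as np

WORD_DIM, CHAR_DIM = 8, 4  # hypothetical embedding sizes

def word_vector(token):
    # Deterministic pseudo-embedding seeded on the token's characters.
    seed = sum(ord(c) for c in token)
    return np.random.default_rng(seed).standard_normal(WORD_DIM)

def char_vector(token):
    # Average of per-character embeddings.
    vecs = [np.random.default_rng(ord(c)).standard_normal(CHAR_DIM) for c in token]
    return np.mean(vecs, axis=0)

def feature_vector(sentence):
    tokens = sentence.split()  # naive whitespace parse
    per_token = [np.concatenate([word_vector(t), char_vector(t)]) for t in tokens]
    return np.stack(per_token)  # shape: (num_tokens, WORD_DIM + CHAR_DIM)

feats = feature_vector("label this short message")
print(feats.shape)  # (4, 12)
```

Each token's word vector and character vector are concatenated, so downstream labelling layers receive a single per-token feature vector carrying both levels of information.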
As per claim 14, Du teaches providing an attention layer of the neural architecture, wherein the attention layer: receives a received message represented as a vector at a different time step; determines a focus of each piece of information in the received message; and captures a contextual information of the received message and based on the contextual information reducing a noise present in one or more message representations (Fig. 1, page 1281, and page 1280, col. 1, lines 45-59, at the attention layer, context embedding is performed on input raw text and corresponding vectors are provided to several fully connected layers at the label decision level).
As per claim 15, Du may not explicitly disclose wherein the attention layer in the neural architecture comprises a dot product type which uses a dot product of a scores matrix and a set of encoder states to calculate a final score, and wherein the received message comprises a contextual message and a received message. Lu, in the same field of endeavor, teaches wherein the attention layer in the neural architecture comprises a dot product type which uses a dot product of a scores matrix and a set of encoder states to calculate a final score, and wherein the received message comprises a contextual message and a received message (col. 14, lines 40-67, and col. 16, lines 1-15). Therefore, it would have been obvious before the effective filing date of the claimed invention to use the above features of Lu with the system of Du, in order to quantify vector alignment and provide accurate labeling results.
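For illustration only, dot-product attention of the type recited in claim 15 may be sketched as follows: a scores matrix is computed by dotting queries against encoder states, normalized, and then multiplied back into the encoder states to produce the final context vectors. The dimensions below are arbitrary and this is not Lu's implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def dot_product_attention(queries, encoder_states):
    # scores[i, j]: alignment of query i with encoder state j
    scores = queries @ encoder_states.T
    weights = softmax(scores, axis=-1)        # each row sums to 1
    context = weights @ encoder_states        # weighted sum of encoder states
    return context, weights

rng = np.random.default_rng(1)
enc = rng.standard_normal((5, 16))   # 5 encoder time steps, dimension 16
q = rng.standard_normal((3, 16))     # 3 query vectors
context, weights = dot_product_attention(q, enc)
print(context.shape, weights.shape)  # (3, 16) (3, 5)
```

The softmax-normalized scores matrix weights each encoder state by its alignment with the query, which is the sense in which the dot product "quantifies vector alignment".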
As per claim 16, Du may not explicitly disclose applying a conditional random field (CRF) to an output of the attention layer to infer a label sequence with a highest probability given the message context. Lu, in the same field of endeavor, teaches applying a conditional random field (CRF) to an output of the attention layer to infer a label sequence with a highest probability given the message context (col. 14, lines 12-21 and 33-36; col. 15, lines 15-22). Therefore, it would have been obvious before the effective filing date of the claimed invention to use the above features of Lu with the system of Du, in order to enable modeling of dependencies and relationships between labels in crucial tasks such as data labeling.
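For illustration only, inferring the highest-probability label sequence from a linear-chain CRF is typically done with Viterbi decoding over per-token emission scores (e.g., the attention layer's output) and label-transition scores. The toy scores below are hypothetical and this is not Lu's code.

```python
import numpy as np

def viterbi(emissions, transitions):
    # emissions: (T, L) per-token label scores; transitions: (L, L) label-pair scores.
    T, L = emissions.shape
    score = emissions[0].copy()          # best score ending in each label at t=0
    back = np.zeros((T, L), dtype=int)   # backpointers for path recovery
    for t in range(1, T):
        total = score[:, None] + transitions + emissions[t][None, :]
        back[t] = total.argmax(axis=0)
        score = total.max(axis=0)
    path = [int(score.argmax())]
    for t in range(T - 1, 0, -1):        # follow backpointers in reverse
        path.append(int(back[t][path[-1]]))
    return path[::-1]

emissions = np.array([[2.0, 0.1], [0.1, 2.0], [0.1, 2.0]])
transitions = np.array([[0.5, -0.5], [-0.5, 0.5]])  # favors staying on a label
print(viterbi(emissions, transitions))  # [0, 1, 1]
```

The transition matrix is what lets the CRF model dependencies between adjacent labels rather than scoring each token independently.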
Claims 17-19 are rejected under 35 U.S.C. 103 as being unpatentable over Du et al. (Journal of the American Medical Informatics Association, published June 24, 2019, pages 1279-1285) in view of Lu (US 11017173), and further in view of Rossi (US 20200082273).
As per claim 17, Du in view of Lu does not explicitly disclose using one or more DAGFrames for layer-based labelling. Rossi, in the same field of endeavor, teaches a neural network with multiple hidden layers, wherein a directed acyclic graph (DAG) including directed edges between two respective nodes is used ([0018], [0044], [0045], [0050]). Therefore, it would have been obvious before the effective filing date of the claimed invention to use the above features of Rossi with the system of Du in view of Lu, in order to improve natural language processing and facilitate the debugging of errors.
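For illustration only, representing a network's layers as a DAG with directed edges between nodes, as in Rossi, implies a topological execution order in which each layer runs only after the layers it depends on. The layer names below are hypothetical.

```python
from collections import deque

# node -> nodes it feeds into (directed edges; acyclic by construction)
edges = {
    "embed": ["layer0"],
    "layer0": ["layer1"],
    "layer1": ["layer2", "attention"],
    "attention": ["layer2"],
    "layer2": [],
}

def topological_order(edges):
    # Kahn's algorithm: repeatedly emit nodes with no remaining incoming edges.
    indeg = {n: 0 for n in edges}
    for outs in edges.values():
        for m in outs:
            indeg[m] += 1
    queue = deque(n for n, d in indeg.items() if d == 0)
    order = []
    while queue:
        n = queue.popleft()
        order.append(n)
        for m in edges[n]:
            indeg[m] -= 1
            if indeg[m] == 0:
                queue.append(m)
    return order

print(topological_order(edges))  # ['embed', 'layer0', 'layer1', 'attention', 'layer2']
```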
As per claim 18, Du may not explicitly disclose wherein a Bidirectional LSTM is used for sequence labelling by the neural architecture. Lu, in the same field of endeavor, teaches using a Bidirectional LSTM for sequence labelling by the neural architecture (col. 14, lines 1-3, Bidirectional LSTM). Therefore, it would have been obvious before the effective filing date of the claimed invention to use the above features of Lu with the system of Du, in order to maintain context across long sentences or paragraphs, leading to more accurate topic or intent labeling.
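For illustration only, a bidirectional LSTM for sequence labelling runs one LSTM pass forward and one backward over the token vectors, then concatenates the two hidden states per token so each label decision sees both left and right context. This minimal sketch uses random weights and hypothetical dimensions; it is not Lu's model, and in practice a per-token linear layer would map each concatenated vector to label scores.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_pass(xs, W, b, H):
    # One directional LSTM pass over a list of input vectors xs.
    h, c, out = np.zeros(H), np.zeros(H), []
    for x in xs:
        z = W @ np.concatenate([x, h]) + b
        i, f, o, g = z[:H], z[H:2*H], z[2*H:3*H], z[3*H:]  # gate pre-activations
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
        out.append(h)
    return out

rng = np.random.default_rng(2)
D, H, T = 6, 5, 4                    # input dim, hidden dim, sequence length
W = rng.standard_normal((4 * H, D + H))
b = np.zeros(4 * H)
xs = [rng.standard_normal(D) for _ in range(T)]

fwd = lstm_pass(xs, W, b, H)                 # left-to-right context
bwd = lstm_pass(xs[::-1], W, b, H)[::-1]     # right-to-left, re-aligned
bi = [np.concatenate([f, bk] ) for f, bk in zip(fwd, bwd)]
print(len(bi), bi[0].shape)  # 4 (10,)
```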
As per claim 19, Du teaches using a seq2seq model by the neural architecture (page 1281, col. 2, lines 9-13, and Table 1 on page 1282). Du in view of Lu does not explicitly disclose the use of a DAGFrame by the neural architecture. Rossi, in the same field of endeavor, teaches a neural network with multiple hidden layers, wherein a directed acyclic graph (DAG) including directed edges between two respective nodes is used ([0018], [0044], [0045], [0050]). Therefore, it would have been obvious before the effective filing date of the claimed invention to use the above features of Rossi with the system of Du in view of Lu, in order to improve natural language processing and facilitate the debugging of errors.
Claim 20 is rejected under 35 U.S.C. 103 as being unpatentable over Du et al. in view of Lu and Rossi, and further in view of Watson (US 20200012890).
As per claim 20, Du in view of Lu does not explicitly disclose wherein the set of labelling layers present in the neural architecture are numbered 0 through 4. However, selecting a number of layers for a neural network is well known in the art as evidenced by Watson ([0063]). Therefore, it would have been obvious before the effective filing date of the claimed invention to use Watson's above feature with the system of Du in view of Lu and Rossi, in order to number the labeling layers 0 through 4, as claimed, which would improve accuracy on complex tasks.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. See PTO-892.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ABDELALI SERROU whose telephone number is (571) 272-7638. The examiner can normally be reached M-F, 9 AM - 5 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Pierre-Louis Desir can be reached at 571-272-7799. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/ABDELALI SERROU/ Primary Examiner, Art Unit 2659
Abdelali.serrou@uspto.gov