DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
This action is in response to the Preliminary Amendment filed on June 28, 2024.
Claims 1-7, 9-10, and 13-23 are pending in this action. Claims 9-10 have been amended. Claims 8 and 11-12 have been canceled. Claims 13-23 have been newly added.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1 and 9-10 are rejected under 35 U.S.C. 103 as being unpatentable over Liu et al. (CN 111723913A) in view of Fu et al. (US 2020/0219486).
As per claim 1, Liu discloses a speech recognition method (Figs. 1-8), comprising:
inputting a to-be-recognized speech segment to a Long Short Term Memory (LSTM) model (Fig. 1, element S101: obtaining the media object to be processed and inputting the media object to the LSTM network, wherein the media object can be a speech segment for speech recognition);
processing the to-be-recognized speech segment through the LSTM model so as to obtain a speech recognition result (Fig. 1; it should be noted that the LSTM network is an LSTM network capable of processing the media object; that is, in this embodiment, the LSTM network can be specifically trained for any one of speech recognition, machine translation, language modeling, emotion analysis, or text prediction as a specific application of the network);
wherein the LSTM model comprises at least one processing layer, each of the processing layers comprises a plurality of processing units respectively, each of the processing units determines an output quantity for the target time of the corresponding unit through two single loops based on an input data set of the corresponding unit and a historical state data set prior to a target time, the target time is a time corresponding to the input data set of the corresponding unit, and the output quantity for each time prior to the target time comprises the historical state data set prior to the target time (Fig. 2, Formulas (1)-(14): target data generated while the LSTM network is processing the media object and to be processed by utilizing a gate structure is acquired, the target data specifically comprising Wx (an input weight data matrix), Wh (a hidden state weight data matrix), X (input data, i.e., the input data set), H (hidden state data, i.e., the historical state data set), and B (a bias data matrix); the target data is rearranged by utilizing a parallelism parameter of an FPGA to produce parallel data; matrix-vector multiplication is performed on the parallel data by a matrix-vector multiplication unit group in the FPGA to produce a processing result; each time the multiplication units complete P multiplication operations, the results are output to an addition unit; results output by the addition unit are accumulated by an accumulation unit; because the parallelism is P, the accumulation unit completes the accumulation of Nh/P pieces of data to produce the accumulated result of multiplying a column of a weight matrix by X (or H), and outputs the result to the next level; the parallelism parameter may be 2 (i.e., the LSTM model comprises at least one processing layer; each processing layer comprises multiple processing units; each processing unit determines, on the basis of an input data set and of a historical state data set corresponding to the unit, an output quantity for the target time of the corresponding unit; the output of the preceding processing layer in two adjacent processing layers serves as the input of the subsequent processing layer); the data processing device may comprise one or more processors and a memory; a computer program is stored in a readable storage medium; and the computer program, when executed by the processor, implements the steps of the data processing method of the method embodiment stated above) (an illustrative sketch of this blocked computation is provided after the rejection of claim 1 below);
wherein an output of a former processing layer of two adjacent processing layers serves as an input of a latter processing layer, and an output of a former processing unit of two adjacent processing units serves as an input of a latter processing unit (Fig. 2, the output of the preceding processing unit in two adjacent processing units serves as the input of the subsequent processing unit; and the processing result is fed back to the LSTM network for continued processing to produce an output result of the media object);
wherein an input data set of a first processing layer of the LSTM model comprises vectors of a plurality of recognition units respectively corresponding to each audio frame in the to-be-recognized speech segment.
Liu does not explicitly disclose, but Fu discloses, wherein an input data set of a first processing layer of the LSTM model comprises vectors of a plurality of recognition units respectively corresponding to each audio frame in the to-be-recognized speech segment (Fig. 6, Paragraphs 0048-0052).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Liu such that an input data set of a first processing layer of the LSTM model comprises vectors of a plurality of recognition units respectively corresponding to each audio frame in the to-be-recognized speech segment, as taught by Fu, so that the performance of the speech recognition system is increased (Paragraph 0049).
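For clarity of the record, the blocked matrix-vector computation cited from Liu above may be sketched as follows. This is a minimal software sketch only, assuming Liu's notation (Wx, Wh, X, H, B, the parallelism P, and the hidden dimension Nh); the function name blocked_matvec and the example dimension values are illustrative and are not taken from either reference.

    # Illustrative sketch only: a software analogue of the blocked matrix-vector
    # multiplication Liu describes, in which a weight matrix (Wx or Wh) is multiplied
    # by a vector (X or H) with parallelism P, and an accumulation unit combines
    # Nh/P partial results per output element.
    import numpy as np

    def blocked_matvec(W, v, P):
        """Multiply weight matrix W by vector v in blocks of P elements."""
        Nh = v.shape[0]
        assert Nh % P == 0, "Liu's example assumes Nh is divisible by the parallelism P"
        out = np.zeros(W.shape[0])
        for row in range(W.shape[0]):
            acc = 0.0
            for block in range(Nh // P):                   # Nh/P accumulation steps
                seg = slice(block * P, (block + 1) * P)
                acc += float(np.dot(W[row, seg], v[seg]))  # P multiplications per step
            out[row] = acc
        return out

    # Example with Nh = 8 hidden units and parallelism P = 2 (the value Liu mentions).
    Nh, P = 8, 2
    rng = np.random.default_rng(0)
    Wx, Wh = rng.standard_normal((Nh, Nh)), rng.standard_normal((Nh, Nh))
    X, H, B = rng.standard_normal(Nh), rng.standard_normal(Nh), rng.standard_normal(Nh)
    pre_activation = blocked_matvec(Wx, X, P) + blocked_matvec(Wh, H, P) + B

The per-row loop over Nh/P blocks mirrors the accumulation unit described in Liu, which combines Nh/P partial products before passing the result to the next level.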
As per claims 9 and 10, they are analyzed and thus rejected for the same reasons set forth for claim 1, because the corresponding claims have similar limitations.
Allowable Subject Matter
Claims 2-7 and 13-23 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
The prior art of record fails to teach or fairly suggest, in combination with the other limitations, particularly: "wherein the output quantity of each of the processing units is determined based on a sum of a product of the input data set of the corresponding unit and a first matrix and a product of the historical state data set of the corresponding unit and a second matrix; the input data set comprises a plurality of input subsets, and the historical state data set comprises a plurality of historical state subsets".
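For the record, and using purely illustrative notation that does not appear in the claims (W_1 for the claimed "first matrix", W_2 for the "second matrix", x_t for the input data set at the target time t, and h_{t-1} for the historical state data set prior to the target time), the quoted limitation corresponds to a pre-activation of the general form:

    z_t = W_1 x_t + W_2 h_{t-1},  with  x_t = \{x_t^{(1)}, \ldots, x_t^{(m)}\}  and  h_{t-1} = \{h_{t-1}^{(1)}, \ldots, h_{t-1}^{(n)}\},

where the superscripted elements denote the claimed input subsets and historical state subsets, respectively.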
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Wang et al. (CN 113284515A) discloses a voice emotion recognition method based on physical wave and loop network.
Wang et al. (US 2020/0027462) discloses a voice control system, wakeup method and wakeup apparatus therefor, electrical appliance and co-processor.
Contact Information
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Abul K. Azad whose telephone number is (571) 272-7599. If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Bhavesh Mehta, can be reached at (571) 272-7453.
Any response to this action should be mailed to:
Commissioner for Patents
P.O. Box 1450
Alexandria, VA 22313-1450
Or faxed to: (571) 273-8300.
Hand-delivered responses should be brought to 401 Dulany Street, Alexandria, VA 22314 (Customer Service Window).
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).
February 17, 2026
/ABUL K AZAD/Primary Examiner, Art Unit 2656