DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-7 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. For example, the independent method claim 6 recites “…converting, by an encoder, an input speech signal into a characteristic expression; converting, by a decoder, speech data into text using an output of the encoder; and randomly selecting, by a learning unit, a block length of the speech signal input to the encoder and causing the encoder and the decoder to perform learning.”
The limitations of “converting …”, “converting …”, and “selecting …”, as drafted, cover mathematical concepts, namely mappings (functions) that transform data from one form to another. More specifically, the encoder and decoder models are mathematical equations that can be calculated by a human on a piece of paper or in the human mind (see pages 12-16 of the instant specification). Therefore, the claim appears to recite merely an abstract idea (i.e., mathematical concepts).
This judicial exception is not integrated into a practical application. In particular, claim 7 recites the additional element of a “computer”. Page 29, lines 09-30 of the instant specification appears to describe a generic computer. Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the integration of the abstract idea into a practical application, the additional element of using a computer is, as noted, a generic computer. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. Therefore, the claim is not patent eligible under 35 U.S.C. 101.
Claims 1-5 and 7 are directed to substantially the same subject matter as independent claim 6 and are rejected under a similar rationale and for further failing to add significantly more.
Claim 7 is rejected under 35 U.S.C. 101 because the claimed invention is directed to non-statutory subject matter. Claim 7 recites “computer-readable non-temporary storage medium”. The specification in para 0170 states: “The ‘computer-readable recording medium’ refers to a portable medium such as a flexible disk, a magneto-optical disk, a ROM, and a CD-ROM, or a storage device such as a hard disk built into a computer system. Further, it is assumed that the “computer-readable recording medium” includes a medium in which a program is held for a certain period of time, like a volatile memory (RAM) inside a computer system that serves as a server or client when a program is transmitted via a network such as the Internet or a communication line such as a telephone line.” There is no clear definition of “non-temporary storage medium”. “Subject Matter Eligibility of Computer Readable Media” in Official Gazette Notice 1351 OG 212 (February 23, 2010) states: “The broadest reasonable interpretation of a claim drawn to a computer readable medium (also called machine readable medium and other such variations) typically covers forms of non-transitory tangible media and transitory propagating signals per se in view of the ordinary and customary meaning of computer readable media, particularly when the specification is silent. See MPEP 2111.01. When the broadest reasonable interpretation of a claim covers a signal per se, the claim must be rejected under 35 U.S.C. § 101 as covering non-statutory subject matter.”
As such, the claimed medium can be interpreted as a “signal”. A “signal” embodying functional descriptive material is neither a process (“actions”), machine, manufacture, nor composition of matter (i.e., a tangible “thing”) and therefore does not fall within one of the four categories of § 101. Rather, a “signal” is a form of energy, in the absence of any physical structure or tangible material. The examiner recommends using “computer-readable non-transitory storage medium” to overcome the rejection. Please see “Subject Matter Eligibility of Computer Readable Media” in Official Gazette Notice 1351 OG 212 (February 23, 2010). Appropriate correction is required.
Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
Claims 1-2 and 6-7 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Yao et al. (“WeNet: Production Oriented Streaming and Non-streaming End-to-End Speech Recognition Toolkit”).
Regarding claim 1, Yao discloses a speech recognition device for performing speech recognition using an end-to-end model (see abstract), the speech recognition device comprising:
an encoder that is a model that converts an input speech signal into a characteristic expression (see section 2.1; fig. 1);
a decoder that is a model that converts speech data into text using an output of the encoder (see section 2.1 – “The CTC Decoder consists of a linear layer, which transforms the Shared Encoder output to the CTC activation while the Attention Decoder consists of multiple Transformer decoder layers”); and
a learning unit configured to randomly select a block length of the speech signal input to the encoder and cause the encoder and the decoder to perform learning (section 2.1.1 Training – “We adopt a dynamic chunk training technique to unify the non-streaming and streaming model…Secondly, the chunk size is varied dynamically from 1 to the max length of the current training utterance in the training, so the trained model learns to predict with arbitrary chunk size”).
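For illustration only, the dynamic chunk training quoted above can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from Yao or from the instant application: the function names `sample_chunk_size` and `chunk_attention_mask`, the uniform sampling distribution, and the boolean mask layout are all hypothetical choices made for this example.

```python
import random


def sample_chunk_size(num_frames, rng):
    """Draw a chunk (block) length from 1..num_frames.

    Assumption: the draw is uniform; the cited passage only states that
    the chunk size varies from 1 to the length of the current utterance.
    """
    if num_frames < 1:
        raise ValueError("num_frames must be positive")
    return rng.randint(1, num_frames)


def chunk_attention_mask(num_frames, chunk_size):
    """Return mask[i][j], True when frame i may attend to frame j.

    Each frame sees all frames up to the end of its own chunk, so
    chunk_size == num_frames gives full (non-streaming) attention.
    """
    mask = []
    for i in range(num_frames):
        chunk_end = min((i // chunk_size + 1) * chunk_size, num_frames)
        mask.append([j < chunk_end for j in range(num_frames)])
    return mask


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed only so the demo is repeatable
    for utterance_len in (6, 6, 9):
        size = sample_chunk_size(utterance_len, rng)
        mask = chunk_attention_mask(utterance_len, size)
        visible = [sum(row) for row in mask]
        print(f"frames={utterance_len} chunk={size} visible per frame={visible}")
```

Because a new chunk size is drawn for every training utterance, a single model is exposed to both short (streaming-like) and full-length (non-streaming-like) contexts during training.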
Regarding claim 2, Yao discloses wherein the encoder includes a first encoder configured to perform processing online and a second encoder configured to perform processing offline (section 2.1.1 – first encoder is shared encoder operating in streaming mode with limited chunk size and second encoder is shared encoder operating in non-streaming mode), the decoder includes a first decoder configured to perform processing online (section 2.1 – “CTC Decoder runs in a streaming mode in the first pass”) and a second decoder configured to perform processing offline (section 2.1 – “the Attention Decoder is used in the second pass to give a more accurate result”), and the speech recognition device further comprises: an integration unit configured to rescore an output of the first decoder and an output of the second decoder at the time of inference to output a final recognition result (section 2.1.2 and also see section 3.1).
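For illustration only, the two-pass arrangement cited above (a streaming first pass followed by a second pass that rescores its output) can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from Yao or from the instant application: the function name `rescore`, the weighted-sum combination, the `first_pass_weight` parameter, and the toy log-probability scores are all hypothetical.

```python
def rescore(nbest, second_pass_score, first_pass_weight=0.5):
    """Pick the hypothesis with the best combined score.

    nbest: list of (text, first_pass_log_prob) pairs from the streaming pass.
    second_pass_score: callable returning a log-probability for a text.
    Assumption: scores are combined as
        first_pass_weight * first_pass + second_pass.
    """
    if not nbest:
        raise ValueError("nbest must contain at least one hypothesis")
    best_text, best_score = None, float("-inf")
    for text, first_pass in nbest:
        combined = first_pass_weight * first_pass + second_pass_score(text)
        if combined > best_score:
            best_text, best_score = text, combined
    return best_text, best_score


if __name__ == "__main__":
    # Toy scores standing in for real first-pass and second-pass model outputs.
    first_pass_nbest = [("hello word", -1.0), ("hello world", -1.2)]
    toy_second_pass = {"hello word": -3.0, "hello world": -0.5}
    text, score = rescore(first_pass_nbest, toy_second_pass.__getitem__)
    print(text, round(score, 3))
```

In this toy example the first pass slightly prefers "hello word", but the second-pass score outweighs that preference, so the combined result is "hello world".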
Regarding claim 6, see rejection of claim 1.
Regarding claim 7, see rejection of claim 1.
Allowable Subject Matter
Claims 3-5 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims and overcoming the rejection(s) under 35 U.S.C. 101, set forth in this Office action.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to NAFIZ E HOQUE whose telephone number is (571)270-1811. The examiner can normally be reached M-F 8-5.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Ahmad Matar, can be reached at (571)272-7488. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/NAFIZ E HOQUE/ Primary Examiner, Art Unit 2693