CTNF 18/176,252 CTNF 89039

DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Allowable Subject Matter

Claims 5, 12 and 19 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

Claim Rejections - 35 USC § 102

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:

A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.

Claims 1-4, 6-11, 13-18 and 20 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Kwon (NPL: “Voice Frequency Synthesis using VAW GAN based Amplitude Scaling for Emotion Transformation”).

With respect to claim 8 (similarly claims 1 and 15), Kwon teaches an apparatus (e.g. the apparatus used to conduct the experiment, section 4.2 the first 4 lines): audio interface circuitry (e.g.
the audio interface of Fig 2 where data sources are received, see also the speech data used in this study, section 4.1 the first 3 lines); and one or more processors (e.g. the processors of section 4.2 the first 4 lines) to execute instructions to: identify a first vocal effort of a first audio segment of first audio data and a second vocal effort of a second audio segment of the first audio data (e.g. the emotion classes that the speech represents include angry and disgust expressions, amongst other expressions, see section 4.1 the fourth line, which suggests identifying angry/a first vocal effort of a first expression/audio segment of the speech and disgust/a second vocal effort of a second expression/audio segment of the speech); train a neural network including training data (e.g. training and voice transformation, Fig 2; train VAW-GAN-based generation model, section 3.2, including training data used in this study, see section 3.1 and/or 4.1), the training data including the first vocal effort, the first audio segment, the second audio segment, and the second vocal effort (e.g. the training data including angry, the first expression, the second expression and disgust, as suggested in section 4.1 fourth line); and deploy the neural network (e.g. deploy the VAW-GAN-based generation model of section 3.2, as suggested in section 4.2, for evaluation), the neural network to distinguish between the first vocal effort and the second vocal effort (e.g. the VAW-GAN-based generation model to distinguish between angry and disgust, as suggested in section 3.1, Fig 3, which shows SP and F0, the features of the speech data with different emotion classes in the same sentence).

With respect to claim 9 (similarly claims 2 and 16), Kwon teaches the apparatus of claim 8, wherein the one or more processors executes the instructions to preprocess the first audio segment by extracting linear predictive coefficients from the first audio segment (e.g.
the structure of voice frequency analysis of Fig 2 includes data pre-processing and feature extraction with the use of GAN, which is applied in section 4.2 where Mel-cepstral coefficients (MCEP) seem to be extracted, see equation 2 of section 4.2, suggesting that the processors execute the instructions to preprocess the first audio segment by extracting linear predictive coefficients from the first audio segment), the linear predictive coefficients including a time-frequency representation of the first audio segment (e.g. Fig 7 section 4.1 shows the linear predictive coefficients including a time-frequency representation of the first audio segment, i.e. angry).

With respect to claim 10 (similarly claims 3 and 17), Kwon teaches the apparatus of claim 8, wherein the training data further includes a third audio segment including a regular vocal effort, a fourth audio segment including a loud vocal effort, and a fifth audio segment including a yelled vocal effort (e.g. the emotion classes that the speech represents are happy, calm, sad, fearful, angry, surprise, and disgust expressions, section 4.1 fourth line, which expressions include a third audio segment including a regular vocal effort, a fourth audio segment including a loud vocal effort, and a fifth audio segment including a yelled vocal effort in addition to angry and disgust selected in the rejection of claim 8 as the first and second vocal efforts).

With respect to claim 11 (similarly claims 4 and 18), Kwon teaches the apparatus of claim 8, wherein the one or more processors executes the instructions to: analyze, via the neural network (e.g. analyze via VAW-GAN generation model, as suggested in section 4), a third audio segment (e.g. a third expression, as suggested in section 4.1 fourth line) to determine a third vocal effort of the third audio segment (e.g.
to determine a third vocal effort of the third expression, whereby the third vocal effort is among the unselected expressions of section 4.1 fourth line, like surprise in Fig 6); and output, via the neural network, metadata including an indication corresponding to the third vocal effort (e.g. output, via the VAW-GAN generation model, metadata including an indication corresponding to the third vocal effort, as suggested in Fig 3 section 3.1 for neutral, angry and disgust), the indication having a first value when the third vocal effort is a whispered vocal effort, the indication having a second value when the third vocal effort is a soft vocal effort, the indication having a third value when the third vocal effort is neither the soft vocal effort or the whispered vocal effort (e.g. Fig 3 section 3.1 and Fig 6, disclosing the indication having values when the vocal effort is angry, disgust, surprise and neutral, suggest the indication having a first value when the third vocal effort is a whispered vocal effort, the indication having a second value when the third vocal effort is a soft vocal effort, the indication having a third value when the third vocal effort is neither the soft vocal effort or the whispered vocal effort, when the remaining expressions of section 4.1 fourth line are processed similarly to those processed in Figs 3 and 6).

With respect to claim 14 (similarly claims 7 and 20), Kwon teaches the apparatus of claim 8, wherein the one or more processors executes the instructions to identify the first vocal effort by identifying a presence of harmonics indicative of a whispered vocal effort in the first audio segment (e.g. Figs 3 and 6 identify the first vocal effort by identifying a presence of harmonics indicative of a whispered vocal effort in the first audio segment) and the identification of the second vocal effort including identifying an absence of harmonics indicative of a soft vocal effort in the second audio segment (e.g.
Figs 3 and 6 suggest the identification of the second vocal effort including identifying an absence of harmonics indicative of a soft vocal effort in the second audio segment).

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 6 and 13 are rejected under 35 U.S.C. 103 as being unpatentable over Kwon (NPL: “Voice Frequency Synthesis using VAW GAN based Amplitude Scaling for Emotion Transformation”) in view of Guez (US 5,293,456).

With respect to claim 13 (similarly claim 6), Kwon teaches the apparatus of claim 8 including the neural network. However, Kwon fails to teach wherein the neural network is a feed-forward fully layered neural network. Guez teaches a neural network that is a feed-forward fully layered neural network (e.g.
a conventional, three-layer, fully-connected, feedforward neural network trained using back-propagation techniques is shown generally at 10 in FIG. 1, see col 1 ln 21-23). Kwon and Guez are analogous art because they both pertain to voice/object recognition systems. Therefore, it would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify Kwon with the three-layer, fully-connected, feedforward neural network of Guez, as disclosed in Fig 1 col 1 ln 21-23. The benefit would be to reduce the training time in neural networks having large input arrays, Guez col 2 ln 8-10.

Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to IBRAHIM SIDDO whose telephone number is (571) 272-4508. The examiner can normally be reached 9:00-5:30PM.

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Akwasi Sarpong, can be reached at (571) 270-3438. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).
If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/IBRAHIM SIDDO/
Primary Examiner, Art Unit 2681

Application/Control Number: 18/176,252 Page 2 Art Unit: 2681