DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claim(s) 1-3, 5-13, 15-20 are rejected under 35 U.S.C. 103 as being unpatentable over Lyu (US 11,664,043) in view of Chen (US 2012/0016672).
With respect to claim 1 (similarly claim 11), Lyu teaches an AI-based automated personality and behavior analytic and assessment system (e.g. the system of Figs 2-3), comprising:
a machine learning data repository storing unstructured and structured training data (e.g. training dataset 326 Fig 3 col 17 ln 10-14);
an input interface (e.g. audio capture 204 Fig 3) configured for receiving input data containing signals related to audio speech (e.g. the audio capture service 204 of the user device 102 may capture audio 316 Fig 3 col 14 ln 9-10, see also Fig 5 where speech data is received col 21 ln 3-22);
an automatic speech recognition (e.g. ASR 304 Fig 3 col 15 ln 49-54), diarization (e.g. an emotion recognition engine 310 Fig 3 col 16 ln 5-22), and transcription module (e.g. HMM col 15 ln 49-54) configured to receive the input data (e.g. receive audio 316 Fig 3), recognize the spoken words in the input data (e.g. ASR 304 recognizes the spoken words in the input data 316, see Fig 3), and generate an output representing the spoken words attributed to the individual (e.g. and generate an output representing the spoken words attributed to the user, see col 16 ln 5-22), the output including speaking moments (e.g. the output includes speaking moments, as suggested in Fig 5 S506) and audio slices (e.g. and audio segments as suggested in Fig 5 S506);
a machine learning text-based feature generation pipeline configured to receive the speaking moments (e.g. a bag-of-words machine learning model receives the speaking moments, Fig 5 S506, col 21 ln 42-48) and generate a numerical text-based feature set (e.g. and generate values, i.e., the tokenization process may break the filtered text down to the individual words “you”, “look”, “sexy”, and “baby”; these words may then be associated with values in a vector indicating the words themselves, the position of the words, and/or the frequency of occurrence of the words, col 21 ln 49-62);
a machine learning audio-based feature generation pipeline configured to receive the audio slices (e.g. one of the machine learning models of col 21 ln 42-48 receives the audio slices) and generate a numerical audio-based feature set (e.g. and generate the values in the vector representation of col 21 ln 49-62);
a machine learning inference processor (e.g. a prediction model, col 21 ln 63 - col 22 ln 4) configured to receive at least one of the numerical text-based feature set and the numerical audio-based feature set (e.g. receive the values of col 21 ln 49-62), develop inferences from the feature sets (e.g. apply the resultant of the vector representation to the prediction model, col 21 ln 63-65), and generate a set of scores representing the probabilities for a number of personality and behavioral traits (e.g. generate a probability indicating the likelihood that the text segment includes harassing text, col 22 ln 5-15; see also the results of the experiments in Figs 7-9, which represent a set of scores representing the probabilities for a number of personality and behavioral traits; an illustrative sketch of this tokenization-to-probability flow follows this rejection);
Although Lyu produces a probability/set of scores indicating that the text segment corresponding to the audio segment is harassing or associated with harassment (col 22 ln 5-15), Lyu fails to teach a user interface configured to present information related to the generated set of scores to a user.
Chen teaches a user interface configured to present information related to the generated set of scores to a user (e.g. a display configured to display a non-native speech proficiency score, see Fig 8 [0048] and claim 15).
Lyu and Chen are analogous art because both pertain to processing text segments/words to produce a probability/score. Therefore, it would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify Lyu with the graphical user interface of Chen to include: a user interface configured to present information related to the generated set of scores to a user, as suggested by Chen in Fig 8, [0048], and claim 15. The benefit of the modification would be to present verbal harassment detected in real time to a user, as suggested by Lyu at col 22 ln 63-67.
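For context on the mapping above, the tokenization-to-probability flow that the rejection attributes to Lyu (col 21 ln 42 - col 22 ln 15) can be summarized in a minimal sketch. The vocabulary, weights, and function names below are hypothetical illustrations, not teachings of the reference:

```python
from collections import Counter
import math

# Hypothetical vocabulary and weights; in Lyu these would be learned from
# training dataset 326 (Fig 3). Nothing below is drawn from the reference.
VOCAB = ["you", "look", "sexy", "baby"]
WEIGHTS = [0.1, 0.2, 1.5, 0.9]  # toy per-token weights of a linear model
BIAS = -1.0

def text_feature_vector(tokens):
    """Bag-of-words counts, cf. the tokenization described at col 21 ln 49-62."""
    counts = Counter(tokens)
    return [counts[word] for word in VOCAB]

def predict_probability(vector):
    """Toy stand-in for the prediction model (col 21 ln 63 - col 22 ln 4):
    a linear score squashed through a sigmoid to yield a probability."""
    score = BIAS + sum(w * x for w, x in zip(WEIGHTS, vector))
    return 1.0 / (1.0 + math.exp(-score))

tokens = ["you", "look", "sexy", "baby"]  # the example tokens quoted from Lyu
vector = text_feature_vector(tokens)      # -> [1, 1, 1, 1]
print(predict_probability(vector))        # probability the segment is harassing, ~0.85
```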
With respect to claim 2 (similarly claim 12), Lyu teaches the system of claim 1, wherein at least a portion of the input data comprises real-time speech data (e.g. it is desirable to detect harassment, and in particular verbal harassment, in real-time or near real-time (e.g., within seconds or minutes, or before the completion of a ride-sharing event) so that the harassment can be stopped and/or appropriate action (e.g., contacting the police or other authorities) can be taken; col 11 ln 66 - col 12 ln 4 suggests that at least a portion of the input data comprises real-time speech data).
With respect to claim 3 (similarly claim 13), Lyu teaches the system of claim 1, wherein at least a portion of the input data comprises a stored file containing audio data (e.g. audio segments associated with each order 320 may be stored at an audio storage repository 322, Fig 3, col 17 ln 44-45, suggesting that at least a portion of the input data comprises a stored file containing audio data).
With respect to claim 5 (similarly claim 15), Lyu teaches the system of claim 1, wherein the machine learning inference processor is configured to perform text analysis, dictionary, and vector model processing on the speaking moments (e.g. Fig 5 S504-S506, col 21 ln 36-62, suggest performing text analysis, dictionary, and vector model processing on the speaking moments).
With respect to claim 6 (similarly claim 16), Lyu teaches the system of claim 1, wherein the user interface is configured to display the information related to the generated set of scores via the web browser-based interface, via an application programming interface, and in curated final reports in PDF format (e.g. the interior interface system 125 displays information related to the generated probability of Fig 5 S508 via the web browser-based interface, via an application programming interface, and in curated final reports in PDF format as suggested in col 9 ln 28-48).
With respect to claim 7 (similarly claim 17), Lyu teaches the system of claim 1, wherein the system is configured to align problem-relevant features to an inference model’s target space (e.g. align problem-relevant features to the target space of an inference model, i.e., the prediction model of Fig 5).
With respect to claim 8 (similarly claim 18), Lyu teaches the system of claim 1, wherein the system further comprises a large language model evaluation process configured to have an ensemble of large language models that are a mixture of fine-tuned and retrieval-augmented generative variants (e.g. model generation system 208, Figs 3-4, comprises a large language model evaluation process configured to have an ensemble of large language models that are a mixture of fine-tuned and retrieval-augmented generative variants).
With respect to claim 9 (similarly claim 19), Lyu teaches the system of claim 1, wherein the ensemble of large language models is configured to evaluate final inference model assumption requirements (e.g. model generation system 208, Figs 3-4, evaluates assumption requirements of the final inference model 306, as suggested in Fig 5).
With respect to claim 10 (similarly claim 20), Lyu teaches the system of claim 9, wherein the assumption requirements are configured to include an establishment of a cognitive load-inducing situation characterizing the statement being assessed, and an evaluation of inference model appropriateness for statement content (e.g. Figs 3-4 suggest the assumption requirements include an establishment of a cognitive load-inducing situation characterizing the statement being assessed, and an evaluation of inference model appropriateness for statement content).
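As context for the ensemble limitation mapped in claims 8-10 (and 18-20), one simple way to combine scores from a mixture of fine-tuned and retrieval-augmented generative variants is plain averaging. The sketch below is a hypothetical illustration; the variant interfaces and scores are invented, not drawn from Lyu:

```python
from typing import Callable, List

# Hypothetical stand-ins: each variant maps a statement to a trait score in [0, 1].
def fine_tuned_variant(statement: str) -> float:
    return 0.7  # placeholder score from a fine-tuned model

def retrieval_augmented_variant(statement: str) -> float:
    return 0.5  # placeholder score from a retrieval-augmented model

def ensemble_score(statement: str, variants: List[Callable[[str], float]]) -> float:
    """Average the per-variant scores into a single ensemble score."""
    return sum(variant(statement) for variant in variants) / len(variants)

print(ensemble_score("example statement",
                     [fine_tuned_variant, retrieval_augmented_variant]))  # 0.6
```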
Claim(s) 4 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Lyu (US 11,664,043) in view of Chen (US 2012/0016672) and further in view of Vaillancourt (US 2011/0046947).
With respect to claim 4 (similarly claim 14), Lyu teaches the system of claim 1, including the machine learning inference processor.
However, Lyu fails to teach wherein the machine learning inference processor is configured to perform Fast Fourier Transform spectral and signal analysis processing on the audio slices.
Vaillancourt teaches a spectral analyzer 105 (Fig 3) configured to perform Fast Fourier Transform spectral and signal analysis processing on audio slices (e.g. a DFT (Discrete Fourier Transform) is used in the spectral analyzer 105 to perform a spectral analysis and spectrum energy estimation of the pre-emphasized decoded tonal sound signal 106, [0038]).
Lyu and Vaillancourt are analogous art because both pertain to processing audio/sound signals. Therefore, it would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify the machine learning inference processor of Lyu with the spectral analyzer 105 of Vaillancourt to teach wherein the machine learning inference processor is configured to perform Fast Fourier Transform spectral and signal analysis processing on the audio slices, as suggested in [0038]. The benefit of the modification would be to distinguish the audio slices from noise so that the noise can be safely removed, as suggested in Fig 5 of Lyu.
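For context on the DFT-based spectral analysis cited from Vaillancourt [0038], a minimal sketch of FFT spectral energy estimation over one audio slice follows; the sample rate, windowing, and function names are generic assumptions, not details of the reference:

```python
import numpy as np

def spectral_energy(audio_slice: np.ndarray, sample_rate: int = 16_000):
    """FFT-based spectral analysis of one audio slice, cf. the DFT-based
    spectrum energy estimation described at [0038] of Vaillancourt."""
    windowed = audio_slice * np.hanning(len(audio_slice))  # reduce spectral leakage
    spectrum = np.fft.rfft(windowed)                       # one-sided DFT of a real signal
    energy = np.abs(spectrum) ** 2                         # per-bin spectral energy
    freqs = np.fft.rfftfreq(len(audio_slice), d=1.0 / sample_rate)
    return freqs, energy

# Example: a one-second 440 Hz tone; the energy peak falls at the bin nearest 440 Hz.
t = np.arange(16_000) / 16_000.0
freqs, energy = spectral_energy(np.sin(2 * np.pi * 440.0 * t))
print(freqs[np.argmax(energy)])  # ~440.0
```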
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to IBRAHIM SIDDO whose telephone number is (571)272-4508. The examiner can normally be reached 9:00 AM-5:30 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Akwasi Sarpong, can be reached at 571-270-3438. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/IBRAHIM SIDDO/Primary Examiner, Art Unit 2681