DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Amendment
Claims 1 and 8 are amended. Claims 1-4, 6-11, and 13-14 are presented for examination.
Response to Arguments
Rejection under 35 U.S.C. 103
Applicant’s arguments (numbered 1 and 2 in applicant’s response) with respect to claim 1 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.
Applicant's arguments (numbered 3 in applicant’s response) with respect to claim 1 have been fully considered but they are not persuasive. Applicant argues “claim 1 recites using X axis (Conservative/Progress) and Y axis (Stable/Fun) as boundary data determination criteria to select the recited plurality of emotional models. In contrast, the difference in degree of positivity (X axis) and degree of excitability (Y axis) of Lee are merely for emotional classification.” However, Lee teaches using emotional classifications to determine different sound models for changing the emotion of a user. Specifically, Lee teaches “[0172] the sound that increases the emotion factor corresponding to the degree of positivity may be set when the vehicle 10 is designed, and may correspond to sound that has at least one of the size, the genre, the equalizer, the tone color, and the sound wave region of the sound in which the user can feel the positive emotion. For example, the sound that increases the emotion factor corresponding to the degree of positivity may include hip-hop music, classical music, and pop music in which the user can feel the positive emotion. However, the sound that increases the emotion factor corresponding to the degree of positivity may be set based on the user's input through the inputter 130. That is, the user may set the sound that caused the positive emotion to be the sound that increases the emotion factor corresponding to the degree of positivity, and the sound that is set can be stored in the storage 160.” Therefore, Lee uses an emotional axis to establish criteria for classifying emotions of a user, and this emotion classification is later used to determine sound models that can improve the positivity of a user’s emotions.
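By way of illustration only, the quadrant-based classification described above can be sketched in a few lines of Python. This is not code from Lee or any other cited reference; the labels, thresholds, and genre mapping are hypothetical stand-ins for Lee's positivity (x-axis) and excitability (y-axis) classification and its subsequent use in selecting sound intended to improve the user's emotional state.

```python
# Hypothetical sketch only -- not from any cited reference. Models the idea of
# classifying a (positivity, excitability) reading into quadrants of a
# two-dimensional emotion model, then mapping the quadrant to a sound choice.

def classify_quadrant(positivity: float, excitability: float) -> str:
    """Map a (positivity, excitability) reading to a quadrant label."""
    if positivity >= 0:
        return "excited-positive" if excitability >= 0 else "calm-positive"
    # Negative positivity corresponds to the second/third quadrants in Lee.
    return "excited-negative" if excitability >= 0 else "calm-negative"

# Hypothetical mapping from quadrant to a sound meant to raise positivity.
SOUND_FOR_QUADRANT = {
    "excited-negative": "classical",   # soothe an agitated negative state
    "calm-negative": "pop",            # energize a depressed state
    "excited-positive": "hip-hop",     # sustain an energetic positive state
    "calm-positive": "ambient",        # sustain a relaxed positive state
}

print(SOUND_FOR_QUADRANT[classify_quadrant(-0.4, 0.7)])  # -> "classical"
```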
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-2, 6-9, and 13-14 are rejected under 35 U.S.C. 103 as being unpatentable over Cella (US 20210356285 A1) in view of Sudo et al. (US 20220185178 A1), Lee et al. (US 20200215294 A1; hereinafter referred to as Lee), and Goran et al. (US 20160234595 A1; hereinafter referred to as Goran).
Regarding claim 1, Cella teaches: an emotion modeling method, comprising: receiving a sound uttered by a user ([0527] A voice-analysis module may take voice input);
determining an emotional attribute based on the sound using an emotion analysis algorithm ([0129] a voice-analysis circuit trained using machine learning that classifies an emotional state of the rider for the captured voice output of the rider);
deriving an instrumental value of a sound concept by analyzing psychosocial consequences of the emotional attribute ([0130] In embodiments, the expert system is trained to optimize the at least one operating parameter based on feedback of outcomes of the emotional states when adjusting the at least one operating parameter for a set of individuals. In embodiments, the emotional state of the rider is determined by a combination of the captured voice output of the rider and at least one other) using an artificial neural network ([0131] a first neural network 3122 trained to classify emotional states based on analysis of human voices detects an emotional state of a rider through recognition of aspects of the voice 31128 of the rider captured while the rider is occupying the vehicle 3110 that correlate to at least one emotional state 3166 of the rider);
generating a plurality of emotional models ([0131] a first neural network trained to classify emotional states based on analysis of human voices detects an emotional state of a rider through recognition of aspects of the voice of the rider captured while the rider is occupying the vehicle that correlate to at least one emotional state of the rider; and a second neural network that optimizes, for achieving a favorable emotional state of the rider, an operational parameter of the vehicle in response to the detected emotional state of the rider) based on the instrumental value of the sound concept… ([0375] Parameters 430 may include rider state parameters 437, such as parameters relating to comfort 439, emotional state, satisfaction, goals, type of trip, fatigue and the like),
and an adaptable sound model ([0153-0154] the second neural network optimizes the operational parameter in real time responsive to the detecting of an emotional state of the rider by the first neural network… the operational parameter that is optimized affects at least one of a route of the vehicle, in-vehicle audio content).
Cella does not explicitly disclose, but Sudo discloses: wherein the plurality of emotional models comprise a cultured sound model ([0073] The server 2 has a database which stores a plurality of pseudo sound data sets in which the sound elements of the engine sound are associated with each of a plurality of predefined driving states. This can include the cultured and entertaining sound models.), an entertaining sound model… ([0069] the sound element selection unit 374 may change the standard of the sound element to be selected with respect to the detected driving state according to the setting. This allows a user to easily experience a high-speed driving sound);
selecting one of the plurality of emotional models based on a drive mode ([0057] Each pseudo sound data set is a data table in which sound elements are associated with each of a plurality of predefined driving states. The sound element is sound data associated with a vehicle model and a driving state), wherein the drive mode is one of an eco mode, a comfort mode, a sports mode, and a smart mode ([0065] The detection unit 35 is one or more sensors having a function of detecting the driving state of a passenger car equipped with the in-vehicle device. For example, the detection unit 35 is any one or more of a speed sensor which detects the speed of a passenger car as a driving state, an accelerometer which detects the acceleration of a passenger car as a driving state, an inclined angle sensor which detects the inclination of the road as the driving state of a passenger car, and the like. Different speeds, accelerations, and angles of the vehicle can represent different drive modes.) and wherein: when the drive mode is set to the comfort mode, the cultured sound model is selected; when the drive mode is set to the sports mode, the entertaining sound model is selected; when the drive mode is set to the smart mode, the adaptable sound model is selected… ([0069] The sound element selection unit 374 has a function of selecting a sound element according to the driving state detected by the detection unit 35 from the pseudo sound data set. The sound element selection unit 374 may change the standard for selecting the sound element according to the setting. For example, the sound element selection unit 374 may switch the sound element to be selected from the sound element of 50 km/h, the sound element of 75 km/h (corresponds to 1.5 times the detected speed), and the sound element of 100 km/h (corresponds to twice the detected speed) according to the setting when the speed of the passenger car is 50 km/h. As described above, the sound element selection unit 374 may change the standard of the sound element to be selected with respect to the detected driving state according to the setting. Different drive states can be mapped to different types of sounds.);
wherein, when the drive mode is set to the eco mode, the vehicle sound is turned off ([0072] The output of the virtual environmental sound is preferably turned on/off according to the user's setting);
Cella and Sudo are considered analogous in the field of audio processing. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Cella to combine the teachings of Sudo because doing so would allow different types of audio to be output depending on the drive mode of a vehicle, improving the immersive experience of a user while driving (Sudo [0072] output the virtual environmental sound from the sound output unit 36. Examples of the virtual environmental sound include wind noise, tire running sound, and brake noise. This makes it possible to play back a more natural driving sound).
The combination of Cella and Sudo does not explicitly teach, but Goran teaches: generating a vehicle sound corresponding to the selected emotional model ([0073] based on a determination of the user's current physical or emotional state, output recommendation engine 126 can cause music content, a feed of news articles that have been designated as being positive content, and/or a feed of image content that have been designated as being amusing content to be presented on user device 102 associated with the user. In another more particular example, output recommendation engine 126 can cause a sound effect (e.g., rain sounds) to be presented on a device having an audio output) using the selected emotional model ([0069] data processing engine 124 can generate one or more profiles that are indicative of the user's physical or emotional state over a given time period. For example, a baseline profile associated with a user can be generated based on data that is determined to be indicative of the user's physical or emotional state during a given time period, such as mornings, a given day, weekdays, weekends, a given week, a season, and/or any other suitable time period. In another example, data processing engine 124 can generate one or more profiles that are indicative of the user's physical or emotional state for a given context);
and outputting the generated vehicle sound through a speaker… ([0154] some implementations, the one or more devices can include fans 1210 and/or 1212, washer/dryer 1220, television 1240, stereo 1250, speakers).
Cella, Sudo, and Goran are considered analogous in the field of audio processing. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Cella and Sudo to combine the teachings of Goran because doing so would allow the generated sound to efficiently help a user with a specific activity or alter their mood, improving a user’s experience while in a vehicle (Goran [0074] the ambient noise can be altered to assist a user in performing a particular activity, to improve a user's mood and/or physical state, and/or for any other suitable reason).
The combination of Cella, Sudo, and Goran does not explicitly teach, but Lee teaches: wherein the generating of the emotional model comprises: establishing a criterion of determining borderline data by performing position calculation of conservative and progress and stability and fun using the instrumental value of the sound concept ([0192] the numerical value of the degree of negativity may correspond to a numerical value of the degree of positivity having a minus (−) value, and may correspond to a case where the emotional state is located on a second quadrant and a third quadrant on the emotion model 400. In the comparison between the numerical value of the degree of excitability and the numerical value of the degree of negativity, it is premised that the comparison is made based on the absolute value of each numerical value); and determining an emotion modeling methodology based on the criterion ([0133] The emotion model 400 may classify the emotions of the user on the basis of predetermined emotion axes. The emotion axes may be determined based on emotions measured from images of the user or from bio-signals of the user. For example, emotional axis 1 may be degrees of positivity or negativity, which are measurable by voices or facial expressions of the user, and emotional axis 2 may be degrees of excitability or activity, which are measurable by the GSR or the EEG);
and wherein establishing the criterion of determining borderline data comprises establishing the criterion for determining three methodologies of the emotion modeling using the position calculation for conservative and progress on an X-axis and stability and fun on a Y-axis ([0135] The emotion model may be a Russell's emotion model. The Russell's emotional model may be expressed by a two-dimensional graph based on the x-axis and the y-axis, and may classify emotions to eight areas of joy (0 degrees), excitement (45 degrees), arousal (90 degrees), pain (135 degrees), unpleasantness (180 degrees), depression (225 degrees), sleepiness (270 degrees), and relaxation (315 degrees). In addition, the eight areas may comprise a total of 28 emotions that are classified into similar emotions belonging to the eight areas. These emotions can include conservative and progress and stability and fun.) to be used in generating the plurality of emotional models ([0172] the sound that increases the emotion factor corresponding to the degree of positivity may be set when the vehicle 10 is designed, and may correspond to sound that has at least one of the size, the genre, the equalizer, the tone color, and the sound wave region of the sound in which the user can feel the positive emotion. For example, the sound that increases the emotion factor corresponding to the degree of positivity may include hip-hop music, classical music, and pop music in which the user can feel the positive emotion. However, the sound that increases the emotion factor corresponding to the degree of positivity may be set based on the user's input through the inputter 130. That is, the user may set the sound that caused the positive emotion to be the sound that increases the emotion factor corresponding to the degree of positivity, and the sound that is set can be stored in the storage 160).
Cella, Sudo, Goran, and Lee are considered analogous in the field of audio processing. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Cella, Sudo, and Goran to combine the teachings of Lee because doing so would allow for the emotional model to continuously update itself based on learning emotional states of the user and provide improved vehicle sounds for a better user experience (Lee [0163] the vehicle 10 may improve the inference result of the neural network by continuously updating the weight, bias and activation function included in the neural network based on the current emotional state, the target emotional state, and information characterizing an arrival time of the user's emotion according to the pattern of the feedback device 150 to the target emotion. That is, the vehicle 10 may store the determined pattern and the information characterizing the arrival time of the user's emotion to the target emotion whenever the vehicle 10 drives, and continuously update the stored neural network based on information characterizing the stored determined pattern and the information characterizing the arrival time).
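Purely as an illustration, the claimed drive-mode-to-model selection can be sketched as a simple lookup. This is a minimal hypothetical sketch, not an implementation from Cella, Sudo, Goran, or Lee; only the mode and model names are taken from the claim language.

```python
# Hypothetical sketch only -- mirrors the claim 1 mapping of drive modes to
# emotional models: comfort -> cultured, sports -> entertaining,
# smart -> adaptable, and eco -> vehicle sound off.

DRIVE_MODE_TO_MODEL = {
    "comfort": "cultured",      # comfort mode selects the cultured sound model
    "sports": "entertaining",   # sports mode selects the entertaining model
    "smart": "adaptable",       # smart mode selects the adaptable model
}

def select_sound_model(drive_mode: str):
    """Return the selected emotional model, or None when sound is off (eco)."""
    if drive_mode == "eco":
        return None  # in eco mode the vehicle sound is turned off
    return DRIVE_MODE_TO_MODEL[drive_mode]

for mode in ("eco", "comfort", "sports", "smart"):
    print(mode, "->", select_sound_model(mode))
```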
Regarding claim 2, the combination of Cella, Sudo, Goran, and Lee teaches: the emotion modeling method of claim 1. Cella further teaches: wherein the determining of the emotional attribute comprises: classifying a user emotion included in the sound using an emotion classifier ([0129] a rider voice capture system deployed to capture voice output of a rider occupying a vehicle; a voice-analysis circuit trained using machine learning that classifies an emotional state of the rider for the captured voice output of the rider);
and converting the classified user emotion into a concrete attribute ([0527] For example, among many other indicators, where a voice of an individual indicates happiness, the expert system may select or recommend upbeat music to maintain that state. Where a voice indicates stress, the system may recommend or provide a control signal to change a planned route to one that is less stressful).
Regarding claim 6, the combination of Cella, Sudo, Goran, and Lee teaches: the emotion modeling method of claim 1. Cella further teaches: wherein the deriving of the instrumental value of the sound concept comprises: classifying vehicle environment development needs ([0129] an artificial intelligence system for voice processing to improve rider satisfaction in a transportation system, comprising: a rider voice capture system deployed to capture voice output of a rider occupying a vehicle; a voice-analysis circuit trained using machine learning that classifies an emotional state of the rider for the captured voice output of the rider; and an expert system trained using machine learning that optimizes at least one operating parameter of the vehicle to change the rider emotional state) using at least one of logistic regression (LR), a support vector machine (SVM), or a K-nearest neighbor (KNN) algorithm ([0148] In embodiments, the artificial intelligence system comprises a recurrent neural network that detects the emotional state of the rider. A neural network is built from layers of units that each perform a logistic-regression-style computation, i.e., a weighted sum followed by a nonlinear activation; logistic regression is therefore inherent in a neural network.), and reflecting the psychosocial consequences in the classified vehicle environment development needs ([0139] using a neural network to determine at least one vehicle operating parameter that affects a state of a rider occupying the operating vehicle; and using an artificial intelligence-based system to optimize the at least one vehicle operating parameter so that a result of the optimizing comprises an improvement in the state of the rider).
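For illustration only, the three classifier families recited in claim 6 (LR, SVM, KNN) can each be exercised in a few lines of scikit-learn on synthetic data. Nothing below is drawn from Cella; the features and labels are hypothetical stand-ins for the claimed vehicle environment development needs.

```python
# Hypothetical sketch only -- generic use of the three named classifier
# families on synthetic data; not an implementation from any cited reference.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))            # hypothetical acoustic/emotion features
y = (X[:, 0] + X[:, 1] > 0).astype(int)  # synthetic two-class "needs" labels

for clf in (LogisticRegression(), SVC(), KNeighborsClassifier(n_neighbors=5)):
    clf.fit(X, y)
    print(type(clf).__name__, "training accuracy:", clf.score(X, y))
```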
Regarding claim 7, the combination of Cella, Sudo, Goran, and Lee teaches: the emotion modeling method of claim 1. Cella further teaches: predicting a vehicle environment function ([0139] using a neural network to determine at least one vehicle operating parameter that affects a state of a rider occupying the operating vehicle; and using an artificial intelligence-based system to optimize the at least one vehicle operating parameter so that a result of the optimizing comprises an improvement in the state of the rider).
Goran further teaches: using at least one of multiple linear regression (MLR) or support vector regression (SVR) to derive the instrumental value of the sound concept ([0066] a determination can be made using a classifier that can classify a portion of data as being relevant to a recommended action (e.g., data that can be used to determine the likelihood that the action may impact the user's emotional state, data that can be used to determine when the recommended action is to be executed, etc.). It should be noted that the classifier can be trained using any suitable machine learning algorithm, such as a support vector machine, a decision tree, a Bayesian model, etc.).
Cella, Sudo, Goran, and Lee are considered analogous in the field of audio processing. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Cella, Sudo, Goran, and Lee to further combine the teachings of Goran because doing so would allow for the use of different machine learning techniques to better analyze a sound concept (Goran [0146] In some implementations, any suitable techniques can be used to identify the one or more sounds and/or noises, such as machine learning (e.g., Bayesian statistics, a neural network, support vector machines, and/or any other suitable machine learning techniques)).
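Similarly, for illustration only, the two regression families recited in claim 7 (MLR and SVR) can be sketched with scikit-learn on synthetic data; the target variable is a hypothetical stand-in for the instrumental value of a sound concept.

```python
# Hypothetical sketch only -- generic use of multiple linear regression and
# support vector regression on synthetic data; not from any cited reference.

import numpy as np
from sklearn.linear_model import LinearRegression  # multiple linear regression
from sklearn.svm import SVR                        # support vector regression

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))                 # hypothetical predictor features
y = 2.0 * X[:, 0] - X[:, 2] + rng.normal(scale=0.1, size=200)

for reg in (LinearRegression(), SVR(kernel="rbf")):
    reg.fit(X, y)
    print(type(reg).__name__, "R^2:", round(reg.score(X, y), 3))
```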
Regarding claim 8, Cella teaches: an emotion modeling apparatus, comprising: a detector configured to detect a sound ([0391] whereby an artificial intelligence/machine learning system may be trained on a training set of data that consists of tracking and recording sets of interactions of humans as the humans interact with a set of interfaces, such as graphical user interfaces (e.g., via interactions with mouse, trackpad, keyboard, touch screen, joystick, remote control devices); audio system interfaces (such as by microphones, smart speakers, voice response interfaces) uttered by a user; and a processor configured to ([0722] as used herein, the term system may define any combination of one or more computing devices, processors, modules, software, firmware, or circuits that operate either independently or in a distributed manner to perform one or more functions). The rest of the claim recites limitations similar to claim 1 and is rejected similarly.
Regarding claim 9, it recites similar limitations as claim 2 and therefore is rejected similarly.
Regarding claim 13, it recites similar limitations as claim 6 and therefore is rejected similarly.
Regarding claim 14, it recites similar limitations as claim 7 and therefore is rejected similarly.
Claims 3 and 10 are rejected under 35 U.S.C. 103 as being unpatentable over Cella in view of Sudo, Goran, and Lee, as applied to claims 1-2, 6-9, and 13-14, and further in view of Chen et al. (US 11449744 B2; hereinafter referred to as Chen).
Regarding claim 3, the combination of Cella, Sudo, Goran, and Lee teaches: the emotion modeling method of claim 2. The combination of Cella, Sudo, Goran, and Lee does not explicitly disclose, but Chen discloses: wherein the classifying of the user emotion comprises: classifying the user emotion using a conversational memory network (CMN) ([col 1, lines 39-41] The architecture described herein can use end-to-end memory networks to model knowledge carryover in multi-turn conversations).
Cella, Sudo, Goran, Lee, and Chen are considered analogous art in the field of audio processing. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Cella, Sudo, Goran, and Lee to combine the teachings of Chen because using a neural network with memory would allow for better contextual understanding of a user's speech (Chen [col 1, lines 39-44] The architecture described herein can use end-to-end memory networks to model knowledge carryover in multi-turn conversations, where inputs encoded with intents and slots can be stored as embeddings in memory and decoding can exploit latent contextual information from memory).
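For illustration only, the core read operation of an end-to-end memory network, as characterized in the Chen quotation above, can be sketched in NumPy: embeddings of prior conversation turns are stored in memory, and the current utterance attends over that memory to retrieve latent context. All dimensions and data below are arbitrary assumptions, not details from Chen.

```python
# Hypothetical sketch only -- toy attention-over-memory read, the mechanism at
# the heart of an end-to-end memory network; dimensions and data are arbitrary.

import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

d = 8                                    # embedding dimension (arbitrary)
rng = np.random.default_rng(2)
memory = rng.normal(size=(5, d))         # embeddings of 5 prior turns
query = rng.normal(size=d)               # embedding of the current utterance

attention = softmax(memory @ query)      # relevance of each stored turn
context = attention @ memory             # context vector read from memory

print("attention over prior turns:", np.round(attention, 3))
```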
Regarding claim 10, it recites similar limitations as claim 3 and therefore is rejected similarly.
Claims 4 and 11 are rejected under 35 U.S.C. 103 as being unpatentable over Cella in view of Sudo, Goran, and Lee, as applied to claims 1-2, 6-9, and 13-14, and further in view of Verbeke et al. (US 20210304787 A1; hereinafter referred to as Verbeke).
Regarding claim 4, the combination of Cella, Sudo, Goran, and Lee teaches: the emotion modeling method of claim 2. The combination of Cella, Sudo, Goran, and Lee does not explicitly disclose, but Verbeke discloses: wherein the converting of the classified user emotion into the concrete attribute comprises: matching a related keyword with the classified user emotion ([0045] VPA (virtual personal assistant) 118 analyzes input 122 and detects specific vocal characteristics commonly associated with feelings of excitement and/or anxiety. As shown in FIG. 4B, VPA 118 then generates an output 132 that matches the detected level of excitement).
Cella, Sudo, Goran, Lee, and Verbeke are considered analogous art in the field of audio processing. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Cella, Sudo, Goran, and Lee to combine the teachings of Verbeke because doing so would allow for the system to perform a better analysis of a user's intentions while driving, thereby increasing user safety (Verbeke [0008] one technical advantage of the disclosed techniques relative to the prior art is that the disclosed techniques enable a VPA to more accurately determine one or more operations to perform on behalf of the user based on the emotional state of the user. Accordingly, when implemented within a vehicle, the disclosed VPA helps to prevent the user from diverting attention away from driving in order to interact with vehicle features, thereby increasing overall driving safety).
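For illustration only, the claimed matching of a related keyword with a classified user emotion can be sketched as a table lookup. The emotion labels and keyword table below are hypothetical, not taken from Verbeke.

```python
# Hypothetical sketch only -- maps a classified emotion label to a related
# keyword, as recited in claim 4; the table entries are invented examples.

EMOTION_KEYWORDS = {
    "excited": ["energetic", "thrilled", "eager"],
    "anxious": ["tense", "worried", "uneasy"],
    "calm": ["relaxed", "content", "at ease"],
}

def match_keyword(classified_emotion: str) -> str:
    """Return the first related keyword for the classified emotion."""
    return EMOTION_KEYWORDS.get(classified_emotion, ["neutral"])[0]

print(match_keyword("anxious"))  # -> "tense"
```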
Regarding claim 11, it recites similar limitations as claim 4 and therefore is rejected similarly.
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Nathan Tengbumroong whose telephone number is (703)756-1725. The examiner can normally be reached Monday - Friday, 11:30 am - 8:00 pm EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Hai Phan, can be reached at 571-272-6338. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/NATHAN TENGBUMROONG/Examiner, Art Unit 2654
/HAI PHAN/Supervisory Patent Examiner, Art Unit 2654