DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Drawings
The drawings are objected to for the following reasons:
In Figure 1, reference numeral “138” is not described in the Specification. Applicants can overcome this objection by deleting reference numeral “138” from Figure 1, or by adding a description of reference numeral “138”, if this can be done without introducing new matter.
In Figure 2, blank boxes 202 to 224 should include the actual text of the corresponding method steps described at ¶[0052] - ¶[0065] of the Specification. Simply having blank boxes renders it more difficult for the public to quickly ascertain the nature of the invention.
Corrected drawing sheets in compliance with 37 CFR 1.121(d) are required in reply to the Office Action to avoid abandonment of the application. Any amended replacement drawing sheet should include all of the figures appearing on the immediate prior version of the sheet, even if only one figure is being amended. The figure or figure number of an amended drawing should not be labeled as “amended.” If a drawing figure is to be canceled, the appropriate figure must be removed from the replacement sheet, and where necessary, the remaining figures must be renumbered and appropriate changes made to the brief description of the several views of the drawings for consistency. Additional replacement sheets may be necessary to show the renumbering of the remaining figures. Each drawing sheet submitted after the filing date of an application must be labeled in the top margin as either “Replacement Sheet” or “New Sheet” pursuant to 37 CFR 1.121(d). If the changes are not accepted by the examiner, Applicants will be notified and informed of any required corrective action in the next Office Action. The objection to the drawings will not be held in abeyance.
Specification
The title of the invention is not descriptive. A new title is required that is clearly indicative of the invention to which the claims are directed.
The following title is suggested: Multi-Modal Sensor Input Within a Predetermined Amount of Time for Performing an Action in a Vehicle
The disclosure is objected to because of the following informalities:
In ¶[0058], “strategy as determined in step 208” should be “strategy as determined in step 210”. See Specification, ¶[0055] - ¶[0056], which describes selecting a strategy in Step 210 of Figure 2, but determining a context is performed in Step 208.
In ¶[0067], “are provided o the driver 301” should be “are provided to the driver 301”.
In ¶[0067], there appears to be a missing closing quotation mark after “to lower the navigation voice, rotate the knob counterclockwise now”.
In ¶[0068], “or indication it will terminate” should be “or indicate it will terminate”.
In ¶[0070], there appears to be an unmatched left parenthesis in “(e.g., first inputs 206(1)”.
In ¶[0071], “activation of a 126” should be “activation of a camera 126”.
Appropriate correction is required.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
Claims 8 to 9, 17 to 18, and 20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Claims 8 and 17 set forth that an instruction informs the passenger to perform a particular gesture “unrelated to any input devices of the vehicle”, which limitation is indefinite under 35 U.S.C. §112(b). Generally, it does not make sense for a gesture to be “unrelated to any input devices of the vehicle”. That is, a gesture is detected by a camera, which is “an input device of the vehicle”, so that a gesture detected by a camera is not “unrelated to any input devices of the vehicle.” The Specification, ¶[0010] and ¶[0020], does provide support for this limitation. However, it is unclear whether “unrelated to any input devices” refers to the instruction or to the gesture. Conceivably, an instruction might not expressly reference any input device, but the limitation could also be interpreted as requiring that the gesture itself be unrelated to any input device. It is suggested that this limitation be canceled because it appears to be ambiguous and logically inconsistent with the embodiments described in the Specification.
Independent claim 20 sets forth a limitation of “wherein the vehicle action is different from what the input device is typically used for”, which is indefinite. The Specification, ¶[0023] and ¶[0063], does provide support for this limitation, but a scope of the limitation is unclear. Generally, if a vehicle action is to open and close a window and an input device is a camera that senses a gesture, then it is not clear if a camera is “typically used for” sensing a gesture to open a window. That is, a scope of what an input device is typically used for is ambiguous if the input device is not specified along with its conventional uses.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1, 3 to 6, 8, 10, 12 to 15, 17, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Schwarz (U.S. Patent Publication 2019/0237079) in view of Ito et al. (U.S. Patent No. 7,437,488).
Concerning independent claims 1, 10, and 20, Schwarz discloses a method and device for performing a multi-modal dialog in a motor vehicle, comprising:
“receiving, via one or more first sensors of a vehicle, a first input from a passenger of the vehicle pertaining to a request, the one or more sensors having a first modality” – carrying out a multimodal dialog includes capturing an input of a vehicle user for activating a voice dialog (¶[0010] - ¶[0011]); provision is made for activating a voice dialog that can comprise a first voice input of a vehicle user (“a first input from a passenger of the vehicle”) (¶[0019]); a multimodal dialog machine captures and processes sensor signals; sensors may comprise a microphone for capturing voice inputs (“one or more first sensors having a first modality”) (¶[0032]); at the start of the method, a first voice input of a vehicle user is captured; a voice input could be ‘Call Robert Meyer’ (“pertaining to a request”) (¶[0037]: Figure 1: Step 10); here, voice is “a first modality”, and a microphone for capturing voice is “one or more sensors having a first modality”;
“providing instructions to the passenger for providing an additional input pertaining to the request [within a predetermined amount of time], via a processor of the vehicle” – a method may provide for an input request to be respectively output in response to the first voice input of the vehicle user (“providing instructions to the passenger”); a vehicle therefore responds to input of the user and makes inquiries or requests further input from the user (“for providing an additional input pertaining to the request”); the input request may be output by voice output or by display on a screen of the vehicle, in particular on a head-up display, on a combination instrument, and/or on a central screen arranged in the center console of the vehicle (¶[0023]); an input request is therefore output; the input request may comprise a list of all telephone numbers for Robert Meyer output on a head-up display and a voice output of content including a graphically highlighted main number, ‘Would you like to call Robert Meyer’s main number?’ (¶[0040]: Figure 1: Step 35); broadly, a vehicle user is “the passenger”;
“receiving, via one or more second sensors of the vehicle, a second input from the passenger pertaining to the request, in response to the instructions, [within the predetermined amount of time,] the one or more second sensors having a second modality that is different from the first modality” – carrying out a multimodal dialog includes capturing an input of a vehicle user for activating gesture recognition (¶[0010] - ¶[0012]); activating gesture recognition is carried out under the condition that a dialog has not been concluded (¶[0022]); provision is made for further input to be or comprise a gesture, wherein the input request is a request to select a suggested option, and wherein the gesture is a pointing gesture; preferably a pointing gesture is carried out with a finger; particularly preferably, the pointing gesture is a pointing gesture carried out in the direction of a screen, wherein the suggested option is displayed on the screen (¶[0026]); sensors may comprise a camera for capturing gestures of the vehicle user (“one or more second sensors of the vehicle”) (¶[0032]); a gesture recognition is now activated, and the vehicle user can conduct the dialog in a multimodal manner; a further input of the vehicle user which is a gesture is captured (“receiving . . . a second input from the passenger pertaining to the request, in response to the instructions”) (¶[0041] - ¶[0042]: Figure 1: Steps 40 to 50); here, a gesture is “a second modality that is different from the first modality” of a voice input;
“interpreting the second input, via the processor” – a gesture is interpreted and carried out (¶[0043]: Figure 1: Step 60);
“performing a vehicle action corresponding to the request based on the interpreting of the second input, via the processor” – a gesture is interpreted and carried out; the telephone call is made (¶[0043]: Figure 1: Step 60).
Concerning independent claims 1, 10, and 20, Schwarz clearly discloses all of these limitations with the exception of providing instructions “within a predetermined amount of time” and receiving a second input “within the predetermined amount of time”. However, Schwarz, at ¶[0006], does briefly note that gesture control for a telephone call can be enabled only for a particular period after a telephone call has been received. This particular period for gesture recognition to be enabled can be construed to correspond to “a predetermined amount of time”. Applicants’ claim limitation of “providing instructions to the passenger . . . within the predetermined amount of time” can be construed to set a time limit on when these instructions are provided, and does not necessarily require that the content of the instructions include any specific reference to a time period within which a second input must be received. That is, the instructions and a second input must occur within “a predetermined amount of time”, but the content of the instructions does not necessarily have to specify that a gesture input must be received within a predetermined amount of time from a voice input.
Concerning independent claims 1, 10, and 20, Ito et al. teaches an interface for car-mounted devices that includes image input unit 11, voice input unit 12 for collecting an operator’s voice, hand detecting unit 13, voice recognizing unit 14, and control unit 15 for forming a control command. (Column 5, Line 63 to Column 6, Line 19: Figure 1) It is first determined whether voice recognizing unit 14 has input, and then a time-counting value of a timer which counts the waiting time is reset. Subsequently, it is determined whether the hand data is input from hand detecting unit 13. When no hand data is input, it is determined whether the waiting time, i.e., the time-counting value of the timer after having confirmed the voice matching, is longer than a preset allowable time T1, e.g., 3 seconds. When the waiting time is shorter than the allowable time T1, the routine returns to S130. When the waiting time is longer than the allowable time T1, it is determined that the operator has no intention of operating the car-mounted devices, and the routine returns to S110. When it is determined at S130 that the hand data is input from hand detecting unit 13, reference is made to a device data table to specify the car-mounted device that is to be controlled. Then a control command is formed for changing the operating state that is obtained over to another operating state, and the control command is transmitted to the device to be controlled. When the operator’s gesture is confirmed before the elapse of allowable time T1 after the operator’s uttering of a demonstrable pronoun, the car-mounted devices corresponding to the gesture are specified as the devices to be controlled, whereby a control command is formed and transmitted for changing over the operating state of the devices to be controlled. (Column 8, Line 49 to Column 9, Line 25: Figure 4) Ito et al., then, teaches that a gesture (“a second input”) must be received “within the predetermined amount of time”.
An objective is to reliably operate a plurality of car-mounted devices without a need of learning complex gestures or many reserved words. (Column 9, Lines 42 to 44) It would have been obvious to one having ordinary skill in the art to set a predetermined amount of time as taught by Ito et al. for a gesture to be received after an instruction is provided in Schwarz for a purpose of reliably operating a plurality of car-mounted devices without a need of learning complex gestures or many reserved words.
Concerning claims 3 and 12, Schwarz discloses that sensors may comprise a microphone for capturing voice inputs (¶[0032]); a voice input could be ‘Call Robert Meyer’ (¶[0037]: Figure 1: Step 10); here, ‘Call Robert Meyer’ is “a speech command”. Similarly, Ito et al. teaches a microphone 12 and control commands. (Figure 4)
Concerning claims 4 to 5 and 13 to 14, Schwarz discloses that an input request may be output by voice output or by display on a screen of the vehicle, in particular on a head-up display, on a combination instrument, and/or on a central screen arranged in the center console of the vehicle (¶[0023]); output devices may comprise a loudspeaker and/or a screen for outputting input requests (¶[0032]). Here, a loudspeaker provides “audio instructions that are provided via a speaker of the vehicle” and “a speaker that is configured to provide the instructions”. Similarly, a head-up display screen provides “visual instructions that are provided via a display screen of the vehicle” and “a display screen that is configured to provide the instructions.”
Concerning claims 6 and 15, Schwarz discloses that an input request is a request to select a suggested option, and a gesture is a pointing gesture carried out with a finger. Particularly preferably, a pointing gesture is a pointing gesture carried out in the direction of a screen, wherein the suggested option is displayed on the screen. If the suggested option is displayed on a screen, provision may be made for the vehicle user to have to point the extended finger in the direction of the screen, and it may be additionally or alternatively required for the user to move his/her finger forward and/or forward and backward in the pointing direction, so that a user then carries out in the air a gesture imitating the actuating of a conventional pushbutton. A gesture may comprise a movement carried out in a substantially horizontal manner with a hand or finger, which can be referred to as a ‘swiping gesture’. (¶[0025] - ¶[0029]) Here, pointing a finger in a direction of a screen in a manner forward and backward, or in a ‘swiping gesture’, is to “engage a particular input device in a particular directional manner”. Implicitly, a screen is close to a user so that it is “in part on a proximity of the passenger to the particular input device”, and “the second input” of a gesture “is received via one or more input sensors as to engagement of the particular input device in the particular directional manner”. Ito et al. teaches that there is a predetermined amount of time in which to receive a second input after an instruction of an input request is provided by Schwarz. That is, Schwarz provides an input request corresponding to “an instruction” that implicitly asks for user input in a particular manner known to a user.
Concerning claims 8 and 17, Schwarz discloses that an input request is a request to select a suggested option, and a gesture is a pointing gesture carried out with a finger (“the instructions inform the passenger to perform a particular gesture”). Particularly preferably, a pointing gesture is a pointing gesture carried out in the direction of a screen, wherein the suggested option is displayed on the screen. (¶[0025] - ¶[0029]) Sensors may comprise a camera for capturing gestures (“the second input is received via one or more cameras as to the particular gesture”). (¶[0032]) Ito et al. teaches that there is “a predetermined amount of time” within which the second input must be received after an “instruction” of an input request is provided by Schwarz.
Concerning independent claim 20, Schwarz additionally includes the limitations of:
“a body; a microphone disposed within the body, the microphone configured to receive a first input from a passenger of the vehicle pertaining to a request of the passenger, the first input comprising a verbal command of the passenger” – provision is made for activating a voice dialog that can comprise a first voice input of a vehicle user (“a verbal command of the passenger”) (¶[0019]); a multimodal dialog machine captures and processes sensor signals; sensors may comprise a microphone for capturing voice inputs (“a microphone configured to receive a first input from a passenger of the vehicle pertaining to a request of the passenger, the first input comprising a verbal command of the passenger”) (¶[0032]); at the start of the method, a first voice input of a vehicle user is captured; a voice input could be ‘Call Robert Meyer’ (“pertaining to a request”) (¶[0037]: Figure 1: Step 10); implicitly, a vehicle includes “a body”, and a microphone incorporated within a vehicle is “disposed within the body”;
“the second input received with an input device that is engaged by the passenger . . . wherein the vehicle action is different than what the input device is typically used for” – an embodiment enables a user to confirm a suggested option by means of a pointing gesture; if a suggested option is displayed on a screen, provision is made for a vehicle user to have to point the extended finger in the direction of the screen; the user carries out in the air a gesture for imitating the actuation of a conventional pushbutton (¶[0025] - ¶[0026]); here, a user engages a screen with a pointing gesture, so that a screen can be construed as “an input device”; a screen is conventionally used for displaying information and is not conventionally used for placing telephone calls (“wherein the vehicle action is different than what the input device is typically used for”).
Claims 2 and 11 are rejected under 35 U.S.C. 103 as being unpatentable over Schwarz (U.S. Patent Publication 2019/0237079) in view of Ito et al. (U.S. Patent No. 7,437,488) as applied to claims 1 and 10 above, and further in view of Doshi et al. (U.S. Patent Publication 2019/0318759).
Ito et al. teaches “a predetermined amount of time” for receiving a gesture after receiving voice to perform a control command, but does not provide “the predetermined amount of time is determined via the processor based on a prior history via adaptive learning.” However, Doshi et al. teaches speech recognition to enable a user to control electronic devices via a user’s voice command. (¶[0002]) A timeout threshold may be set to a predetermined value or may be dynamically adjusted based on a context of a user’s voice command. Specifically, a timeout threshold may be dynamically adjusted based on multiple factors that include historical data derived from the user’s prior data. (¶[0040] - ¶[0041]) The timeout threshold may be pre-determined or may be dynamically adjusted. The timeout threshold may be adjusted based on multiple factors including historical data derived from the user’s prior data. (¶[0065]: Figure 7) An objective is to improve a user’s experience in one or more sound recognition applications. (¶[0001]) It would have been obvious to one having ordinary skill in the art to set an amount of time for receiving a gesture after receiving voice to perform a control command of Ito et al. that is adapted according to historical data of a user as taught by Doshi et al. for a purpose of improving a user’s experience in sound recognition applications.
Claims 7, 9, 16, and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Schwarz (U.S. Patent Publication 2019/0237079) in view of Ito et al. (U.S. Patent No. 7,437,488) as applied to claims 1, 3, 6, 8, 10, 12, 15, and 17 above, and further in view of Holmgren et al. (U.S. Patent Publication 2022/0035477).
Schwarz does not disclose the limitations of “the instructions inform the passenger to engage the particular input device that is usually used for a first vehicle function” and “the second input is received via the one or more input sensors as to engagement of the input device for executing the request with respect to a second vehicle function that is different from and unrelated to the first vehicle function” of claims 7 and 16. Schwarz discloses detecting gestures with a camera and that the gestures can include a ‘swipe gesture’. (¶[0029] and ¶[0032]) Moreover, Schwarz does not disclose the limitations of “the instructions inform the user to swipe a steering wheel of the vehicle via a hand or finger of the passenger” and “the second input is received . . . as to the swiping of the steering wheel of the vehicle via the hand or finger of the passenger” of claims 9 and 18. However, Holmgren et al. teaches a motorist vehicle interface sensor of a steering wheel that includes optoelectronic components. (Abstract) An upper segment of the steering wheel has two concentric bands of proximity sensors. The steering wheel’s concentric bands of proximity sensors identify swiping gestures across the width of the steering wheel grip. A thumb swipe gesture is detected by a steering wheel in accordance with a swipe gesture performed by the driver’s thumb of a hand gripping the steering wheel. (¶[0033] - ¶[0034]: Figures 3 to 5) Holmgren et al., then, teaches detecting that a passenger can “swipe a steering wheel of the vehicle via a hand or finger of the passenger” and that “the second input is received . . . as to the swiping of the steering wheel of the vehicle via the hand or finger of the passenger” of claims 9 and 18. 
Here, a steering wheel is “the particular input device that is usually used for a first vehicle function” of steering the vehicle, but is “for executing the request with respect to a second vehicle function that is different from and unrelated to the first vehicle function” of claims 7 and 16. That is, Schwarz discloses a swipe gesture for controlling a function of selecting a telephone number for a telephone call that is “a second vehicle function that is different from and unrelated to the first function” of steering a vehicle. Holmgren et al. teaches an objective of enabling a driver to keep his hands on the steering wheel and eyes on the road while operating electronic devices and automated features in a vehicle. (¶[0032]) It would have been obvious to one having ordinary skill in the art to provide an input request for a swipe gesture of Schwarz to swipe a steering wheel of a vehicle to control a vehicle function that is different from a steering function as taught by Holmgren et al. for a purpose of enabling a driver to keep his hands on the steering wheel and eyes on the road while operating electronic devices and automated features in a vehicle.
Claim 19 is rejected under 35 U.S.C. 103 as being unpatentable over Schwarz (U.S. Patent Publication 2019/0237079) in view of Ito et al. (U.S. Patent No. 7,437,488) as applied to claim 10 above, and further in view of Neff (U.S. Patent Publication 2017/0169823).
Schwarz discloses controlling a plurality of functions on a vehicle, including selecting a telephone number for a telephone call, setting navigation destinations, and setting radio stations with multimodal input. (¶[0004]) However, Schwarz does not provide “a plurality of different vehicle actions, including opening and closing windows, adjusting distance threshold for cruise control, adjusting volume for sound for a navigation system of the vehicle, and adjusting zoom of a display of the navigation system.” Neff teaches voice control of a motor vehicle that includes various voice commands, including ‘Activate cruise control at the current speed’, ‘Turn high beams on’, ‘Increase the volume of the music output’, ‘Activate heated seats’, ‘Close all windows’, ‘Close the sunroof’, ‘Turn off the radio’, and ‘Lock the car’. (¶[0015] - ¶[0019] and ¶[0084]) Neff, then, teaches “a plurality of different vehicle actions”, including at least “opening and closing windows”. An objective is to provide an improved scope of commands and improved flexibility for voice control of motor vehicles that can be processed locally and at a server external to the vehicle. (¶[0005] - ¶[0007]) It would have been obvious to one having ordinary skill in the art to provide a plurality of different vehicle actions in Schwarz to include opening and closing of windows as taught by Neff for a purpose of providing improved scope and flexibility for voice control of motor vehicles processed locally and at an external server.
Conclusion
The prior art made of record and not relied upon is considered pertinent to Applicants’ disclosure.
Yokoyama et al., Weng et al., and Parekh et al. disclose related prior art.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MARTIN LERNER whose telephone number is (571) 272-7608. The examiner can normally be reached Monday-Thursday 8:30 AM-6:00 PM.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Richemond Dorvil can be reached at (571) 272-7602. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/MARTIN LERNER/Primary Examiner
Art Unit 2658 March 26, 2026