DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Specification
Applicant is reminded of the proper language and format for an abstract of the disclosure.
The abstract should be in narrative form and generally limited to a single paragraph on a separate sheet within the range of 50 to 150 words in length. The abstract should describe the disclosure sufficiently to assist readers in deciding whether there is a need for consulting the full patent text for details.
The language should be clear and concise and should not repeat information given in the title. It should avoid using phrases which can be implied, such as, “The disclosure concerns,” “The disclosure defined by this invention,” “The disclosure describes,” etc. In addition, the form and legal phraseology often used in patent claims, such as “means” and “said,” should be avoided.
The abstract of the disclosure is objected to because the abstract contains a phrase that can be implied, namely “aspects of the disclosure relate to microelectromechanical systems (MEMS)”. Correction is required. See MPEP § 608.01(b).
The disclosure is objected to because of the following informalities: the title is not descriptive. A new title that reflects the inventive features of the claimed invention is respectfully requested.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
Claims 1-20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as failing to set forth the subject matter which the inventor or a joint inventor, or for applications subject to pre-AIA 35 U.S.C. 112, the applicant regards as the invention.
Regarding claims 1, 13, and 17, the claims recite “one or more processors configured to: obtain the audio signal, wherein the audio signal is generated based on detection of sound by a microphone; obtain the motion signal, wherein the motion signal is generated based on detection of motion by a motion sensor mounted on a surface of an object” (claim 1), or similarly “one or more processors configured to: obtain the audio signal based on detection of sound by a microphone; obtain the motion signal based on detection of motion by a motion sensor mounted on a surface of an object” (claims 13 and 17), without describing how the microphone and the processor are arranged relative to one another, to the motion sensor, and to the surface of the object. The claims define only the motion sensor as mounted on “a surface of an object” and do not disclose any essential structural cooperative relationship between the microphone and the object or between the microphone and the motion sensor. The claims are therefore incomplete for omitting essential structural cooperative relationships of elements, such omission amounting to a gap between the necessary structural connections (see MPEP § 2172.01). Without a clearly defined structural cooperative relationship between these elements, there is a disconnect between the motion detection performed by the motion sensor, which is mounted on the object, and the audio detection performed by the microphone, which is not disclosed as being mounted to anything. Further clarification is respectfully requested.
Regarding claim 1, the claim recites one or more processors configured to “determine a context of a contact type of the surface of the object based on the similarity measure” but does not define the contexts or contact types associated with particular levels or ranges of the similarity measure. In particular, the claim does not explain whether the context of the contact is determined based on a higher similarity measure between the signals or based on a lower similarity measure between the signals. Further clarification is respectfully requested.
Regarding claim 11, the claim recites “to determine the context of the contact type of the surface of the object includes comparison of a machine learning engine output to past context types of contacts determined by a machine learning engine” but does not disclose how the contact type is determined from a comparison between the output of one machine learning engine and past contexts determined by another, or whether the two recitations refer to the same engine. For examination purposes, this limitation will be interpreted as the context being determined based on an algorithm.
Regarding claim 13, the claim recites one or more processors configured to “classify a contact type associated with a contact on the surface of the object based on the comparison data” but does not define the contact types or classifications associated with particular levels or ranges of the comparison data. In particular, the claim does not define which contact classification corresponds to which comparison data or results. Further clarification is respectfully requested.
Regarding claim 16, the claim recites “to implement a machine learning engine trained to select the contact type from a plurality of contact types using the comparison data” but does not disclose the criteria for selecting the contact type. For examination purposes, this limitation will be interpreted as encompassing any arbitrarily selected contact type. Further clarification is respectfully requested.
Regarding claim 17, the claim recites one or more processors configured to “select a classification based on the joint correlation data” but does not define which classification is associated with which levels or ranges of the joint correlation data. In particular, the claim does not define the relationship between the classification selection criteria and the joint correlation data. Further clarification is respectfully requested.
Claims 2-10, 12, 14-15, and 18-20 are rejected as being dependent upon a rejected base claim.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-19 are rejected under 35 U.S.C. 103 as being unpatentable over Wiesbauer et al. (Pat. No. US 9,945,746) (hereafter Wiesbauer) in view of Littrell et al. (Pat. No. US 12,216,746) (hereafter Littrell).
Regarding claim 1, Wiesbauer teaches a device comprising:
one or more processors configured to:
obtain the audio signal (i.e., MEMS microphones 342 and 344 receive pressure waves from an ambient environment and transduce the pressure waves to electrical signals. The electrical pressure signals are processed by audio processing circuit 348 including a filter and other audio processing circuits commonly used for microphone signal processing and then supplying audio signals to speaker or another component configured to receive audio signals) (see Column 8, line 21, to Column 10, line 64), wherein the audio signal is generated based on detection of sound (i.e., MEMS microphones 342 and 344 receive pressure waves from an ambient environment and transduce the pressure waves to electrical signals) (see Column 8, line 21, to Column 10, line 64) by a microphone (i.e., MEMS microphones 342 and 344) (see Column 8, line 21, to Column 10, line 64);
obtain the motion signal (i.e., MEMS microphones 342 and 344 receive pressure waves from an ambient environment and transduce the pressure waves to electrical signals. The electrical pressure signals are processed by motion processing circuit 350) (see Column 8, line 21, to Column 10, line 64), wherein the motion signal is generated based on detection of motion (i.e., MEMS microphones 342 and 344 receive pressure waves from an ambient environment and transduce the pressure waves to electrical signals) (see Column 8, line 21, to Column 10, line 64) by a motion sensor (i.e., MEMS microphones 342 and 344, wherein each pressure sensor of the plurality of pressure sensors may be one of a group consisting of a static pressure sensor, a dynamic pressure sensor, and a microelectromechanical (MEMS) microphone) (see Column 8, line 21, to Column 10, line 64) mounted on a surface of an object (i.e., pressure sensor may be coupled to a structure that exhibits a reproducible motion and detecting the motion may include detecting the reproducible motion) (see Column 8, line 21, to Column 10, line 64);
perform a similarity measure based on the audio signal and the motion signal (i.e., comparing the first signal and the second signal may include determining a difference signal between the first signal and the second signal) (see Column 8, line 21, to Column 10, line 64); and
determine a context of a contact type of the surface of the object (i.e., such pressure signals from multiple sensors may be used to detect movements in any direction and the combinations of pressure signals may be used to differentiate between ambient pressure changes, sounds, and motion. For example, pressure signals detected simultaneously at three sensors may correspond to an ambient air pressure change or sound pressure signal) (see Column 3, line 1, to Column 4, line 28) based on the similarity measure (i.e., characterizing a motion based on the comparing) (see Column 8, line 21, to Column 10, line 64); but does not explicitly teach a memory configured to store an audio signal and a motion signal.
Regarding the memory, Littrell teaches a memory (i.e., controller 108 can include one or more storage components (e.g., volatile memory, non-volatile memory, a hard drive, or combinations of them, among others)) (see Column 4, line 24, to Column 5, line 37) configured to store an audio signal (i.e., signals 105 produced by the acoustic transducer 104) (see Column 4, line 24, to Column 5, line 37) and a motion signal (i.e., signals 107 produced by the accelerometer 106) (see Column 4, line 24, to Column 5, line 37); and
one or more processors (i.e., controller 108 can include one or more processing components (e.g., a central processing unit (CPU), an application specific integrated circuit (ASIC), a logic circuit, or combinations of them, among others)) (see Column 4, line 24, to Column 5, line 37) configured to determine a context of a contact type of the surface of the object based on the similarity measure (i.e., controller 108 determines a relationship between the signals 105 and 107 and is configured to capture characteristics of the user's voice from a range of contact levels that the device 200 may have with the user's head) (see Column 4, line 24, to Column 5, line 37). In view of the teaching of Littrell, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to have added a memory in order to store and retrieve the measured signals and algorithms for evaluating motion information.
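For illustration only, the difference-signal comparison mapped above can be sketched as follows; the normalization, the scoring function, and all names are assumptions of this sketch and are not taken from Wiesbauer or Littrell.

```python
# Illustrative sketch only: a difference-signal similarity measure in the
# spirit of Wiesbauer's comparison circuit (names and scoring are assumed).
import numpy as np

def difference_similarity(audio: np.ndarray, motion: np.ndarray) -> float:
    """Return a similarity score in (0, 1]; 1.0 indicates identical signals."""
    # Normalize peak amplitudes so the comparison reflects shape, not scale.
    audio = audio / max(np.max(np.abs(audio)), 1e-12)
    motion = motion / max(np.max(np.abs(motion)), 1e-12)
    residual = audio - motion  # the "difference signal" of the cited passage
    # Lower residual energy maps to a higher similarity score.
    return float(1.0 / (1.0 + np.mean(residual ** 2)))

# Example: two stored signals of equal length are compared.
t = np.linspace(0.0, 1.0, 1000)
print(difference_similarity(np.sin(2 * np.pi * 5 * t),
                            np.sin(2 * np.pi * 5 * t + 0.1)))
```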
Regarding claim 2, Wiesbauer teaches that the one or more processors are configured to perform the similarity measure based on a first comparison between a representation of the audio signal and a representation of the motion signal (i.e., comparing the first signal and the second signal may include determining a difference signal between the first signal and the second signal) (see Column 8, line 21, to Column 10, line 64).
Regarding claim 3, Wiesbauer teaches that the first comparison is a difference of the representation of the audio signal and the representation of the motion signal (i.e., comparing the first signal and the second signal may include determining a difference signal between the first signal and the second signal) (see Column 8, line 21, to Column 10, line 64).
Regarding claim 4, Wiesbauer teaches that the first comparison is a ratio of the representation of the audio signal and the representation of the motion signal (i.e., signal processor 118 receives the difference signals and may perform further calculation to generate a motion value corresponding to velocity direction. The motion value may also include velocity magnitude, dependent on signal processor 118. Signal processor 118 may perform a comparison or difference operation. In such embodiments, comparison circuit 116 may be omitted. In various other embodiments, comparison circuit 116 or signal processor 118 may implement further, more advanced, algorithms to evaluate motion information) (see Column 4, lines 29-61).
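A short sketch of the two comparison forms recited in claims 3 and 4, a difference and a ratio of the two representations, is given below; the zero-guarding of the ratio is an assumption of this sketch, not a teaching of Wiesbauer.

```python
# Assumed sketch of the claimed comparison forms: difference (claim 3)
# and ratio (claim 4) of the audio and motion representations.
import numpy as np

def compare(audio_repr: np.ndarray, motion_repr: np.ndarray,
            mode: str = "difference") -> np.ndarray:
    if mode == "difference":
        return audio_repr - motion_repr
    if mode == "ratio":
        # Guard against division by zero in the motion representation.
        safe = np.where(motion_repr == 0.0, np.finfo(float).eps, motion_repr)
        return audio_repr / safe
    raise ValueError(f"unknown comparison mode: {mode}")
```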
Regarding claim 5, Wiesbauer teaches that the representation of the audio signal is a first correlation (i.e., MEMS microphones 342 and 344 receive pressure waves from an ambient environment and transduce the pressure waves to electrical signals. The electrical pressure signals are processed by audio processing circuit 348 including a filter and other audio processing circuits commonly used for microphone signal processing and then supplying audio signals to speaker or another component configured to receive audio signals) (see Column 8, line 21, to Column 10, line 64) and the representation of the motion signal is a second correlation (i.e., motion processing circuit 350 may include filtering circuits, comparison circuits, and a signal processor for generating motion value 354 related to a determined motion based on pressure signals received at MEMS microphones 342 and 344) (see Column 8, line 21, to Column 10, line 64).
Regarding claim 6, Wiesbauer teaches that the representation of the audio signal is based on a rectification of the audio signal as obtained by the one or more processors (i.e., MEMS microphones 342 and 344 receive pressure waves from an ambient environment and transduce the pressure waves to electrical signals. The electrical pressure signals are processed by audio processing circuit 348 including a filter and other audio processing circuits commonly used for microphone signal processing and then supplying audio signals to speaker or another component configured to receive audio signals) (see Column 8, line 21, to Column 10, line 64).
Regarding claim 7, Wiesbauer as modified by Littrell as disclosed above does not directly or implicitly teach that the first comparison between the representation of the audio signal and the representation of the motion signal is based on: a second comparison of the representation of the audio signal to an audio threshold; and a third comparison of the representation of the motion signal to a motion threshold. However, Littrell teaches that the first comparison between the representation of the audio signal and the representation of the motion signal is based on: a second comparison of the representation of the audio signal to an audio threshold; and a third comparison of the representation of the motion signal to a motion threshold (i.e., comparing characteristics of the acceleration signal and/or the acoustic signal with the stored biometric characteristics includes comparing characteristics of the accelerometer signal with characteristics of a previously captured voice accelerometer signal, comparing characteristics of the acoustic signal with characteristics of a previously captured acoustic transducer signal, or comparing a relationship between the accelerometer signal and the acoustic signal with a relationship between the previously captured voice accelerometer signal and the previously captured acoustic transducer signal, or combinations of them) (see Column 8, line 1, to Column 9, line 31). In view of the teaching of Littrell, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to have performed the additional comparisons in order to further improve the accuracy of the motion detection.
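For illustration, the gated comparison structure that claim 7 recites might look like the following sketch; the threshold values and the energy statistic are assumptions of this sketch and are not drawn from Littrell.

```python
# Assumed sketch of claim 7's structure: the first comparison is performed
# only after a second comparison (audio vs. audio threshold) and a third
# comparison (motion vs. motion threshold) both succeed.
import numpy as np

AUDIO_THRESHOLD = 0.2   # illustrative value, not from the reference
MOTION_THRESHOLD = 0.1  # illustrative value, not from the reference

def gated_comparison(audio_repr: np.ndarray, motion_repr: np.ndarray):
    audio_active = np.mean(audio_repr ** 2) > AUDIO_THRESHOLD     # second comparison
    motion_active = np.mean(motion_repr ** 2) > MOTION_THRESHOLD  # third comparison
    if audio_active and motion_active:
        return audio_repr - motion_repr  # first comparison
    return None  # a gating comparison failed; no first comparison is made
```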
Regarding claim 8, Wiesbauer teaches that determining the context of the contact type of the surface of the object includes classifying the contact type based on a combination of the representation of the audio signal and the representation of the motion signal (i.e., characterizing a motion based on the comparing) (see Column 8, line 21, to Column 10, line 64).
Regarding claim 9, Wiesbauer teaches that determining the context of the contact type of the surface of the object includes classifying the contact type based on a magnitude of contact (i.e., as the device moves up or down, which in this case correspond to motions aligned with the sensor producing pressure signal 101, a detectable signal peak is generated. When the device moves right or left, which corresponds to motions orthogonal to the sensor producing pressure signal 101, little or no signal is generated. In various embodiments, such pressure signals from multiple sensors may be used to detect movements in any direction. In some embodiments, the combinations of pressure signals may be used to differentiate between ambient pressure changes, sounds, and motion. For example, pressure signals detected simultaneously at three sensors may correspond to an ambient air pressure change or sound pressure signal) (see Column 4, lines 11-28).
Regarding claim 10, Wiesbauer as modified by Littrell as disclosed above does not directly or implicitly teach that the context of the contact type of the surface of the object includes at least one of: a scratch, a dent, touch, a non-contact touch, damage, hard touch. However, Littrell teaches that the context of the contact type of the surface of the object includes at least one of: touch, a non-contact touch (i.e., since the voice accelerometer 106 may not contact the user's head in the same way every time, the controller 108 can prompt the user to remove and reinstall the device 200 during repetitions of the enrollment process in order to capture characteristics of the user's voice from a range of contact levels that the device 200 may have with the user's head) (see Column 5, lines 5-37). In view of the teaching of Littrell, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to have established a relationship between the audio signal and the motion signal in order to further improve the motion detection accuracy.
Regarding claim 11, Wiesbauer as modified by Littrell as disclosed above does not directly or implicitly teach that to determine the context of the contact type of the surface of the object includes comparison of a machine learning engine output to past context types of contacts determined by a machine learning engine. However, Littrell teaches that to determine the context of the contact type of the surface of the object includes comparison of a machine learning engine output to past context types of contacts determined by a machine learning engine (i.e., the controller 108 can determine whether the characteristics of the live signal correspond to the stored biometric characteristics for the user within a threshold level of similarity. In some examples, the controller 108 can apply a Gaussian mixture model or another probabilistic model, a neural network or another machine learning model, or combinations of them, among others to the live and stored characteristics to determine whether they sufficiently correspond to one another. As another example, the controller 108 can compute a cross-correlation (e.g., sliding dot product) of the live and stored characteristics (which can be represented as deterministic signals), calculate an error (e.g., mean-squared error) between the live and stored characteristics, or perform another similarity analysis to determine a similarity between the live and stored characteristics. The controller 108 can use similar techniques to compare 506 characteristics of the acceleration signal 107 with the stored acceleration signal characteristics for the user, and to compare 508 a relationship between the signals 105, 107 and the previously captured relationship. By comparing three sets of characteristics, it becomes much more difficult for one user to imitate or be confused with another user relative to techniques using only one or two of these comparisons) (see Column 5, line 66, to Column 7, line 51). In view of the teaching of Littrell, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to have used a machine learning module in order to further increase the motion detection accuracy by improving the detection and comparison accuracy.
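The cited Littrell passage names concrete similarity analyses: a sliding-dot-product cross-correlation and a mean-squared error against stored characteristics. The sketch below illustrates those two computations under assumed normalization and threshold values; it is not a mapping of the reference's implementation.

```python
# Illustrative sketch of the similarity analyses named in the cited passage:
# a normalized sliding-dot-product cross-correlation and a mean-squared error
# between a live signal and stored characteristics (thresholds are assumed).
import numpy as np

def peak_cross_correlation(live: np.ndarray, stored: np.ndarray) -> float:
    live = (live - live.mean()) / (live.std() + 1e-12)
    stored = (stored - stored.mean()) / (stored.std() + 1e-12)
    corr = np.correlate(live, stored, mode="full") / len(stored)
    return float(np.max(corr))  # peak of the sliding dot product

def mean_squared_error(live: np.ndarray, stored: np.ndarray) -> float:
    return float(np.mean((live - stored) ** 2))

def within_threshold_similarity(live, stored,
                                corr_floor=0.8, mse_ceiling=0.05) -> bool:
    # Assumed decision rule: both analyses must fall within threshold levels.
    return (peak_cross_correlation(live, stored) >= corr_floor
            and mean_squared_error(live, stored) <= mse_ceiling)
```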
Regarding claim 12, Wiesbauer as modified by Littrell as disclosed above does not directly or implicitly teach that the machine learning engine is one of: a decision tree, support vector machine, or neural network. However, Littrell teaches that the machine learning engine is one of: a decision tree, support vector machine, or neural network (i.e., a detection scheme (e.g., thresholds, a Gaussian mixture model or another probabilistic model, a neural network or another machine learning model, or combinations of them, among others)) (see Column 5, line 66, to Column 7, line 51). In view of the teaching of Littrell, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to have used a machine learning module, such as a neural network, in order to further increase the motion detection accuracy by improving the detection and comparison accuracy.
Regarding claim 13, Wiesbauer teaches a device comprising:
one or more processors configured to:
obtain the audio signal (i.e., MEMS microphones 342 and 344 receive pressure waves from an ambient environment and transduce the pressure waves to electrical signals. The electrical pressure signals are processed by audio processing circuit 348 including a filter and other audio processing circuits commonly used for microphone signal processing and then supplying audio signals to speaker or another component configured to receive audio signals) (see Column 8, line 21, to Column 10, line 64) based on detection of sound (i.e., MEMS microphones 342 and 344 receive pressure waves from an ambient environment and transduce the pressure waves to electrical signals) (see Column 8, line 21, to Column 10, line 64) by a microphone (i.e., MEMS microphones 342 and 344) (see Column 8, line 21, to Column 10, line 64);
obtain the motion signal (i.e., MEMS microphones 342 and 344 receive pressure waves from an ambient environment and transduce the pressure waves to electrical signals. The electrical pressure signals are processed by motion processing circuit 350) (see Column 8, line 21, to Column 10, line 64) based on detection of motion (i.e., MEMS microphones 342 and 344 receive pressure waves from an ambient environment and transduce the pressure waves to electrical signals) (see Column 8, line 21, to Column 10, line 64) by a motion sensor (i.e., MEMS microphones 342 and 344, wherein each pressure sensor of the plurality of pressure sensors may be one of a group consisting of a static pressure sensor, a dynamic pressure sensor, and a microelectromechanical (MEMS) microphone) (see Column 8, line 21, to Column 10, line 64) mounted on a surface of an object (i.e., pressure sensor may be coupled to a structure that exhibits a reproducible motion and detecting the motion may include detecting the reproducible motion) (see Column 8, line 21, to Column 10, line 64);
perform one or more comparisons of the audio signal and the motion signal to generate comparison data (i.e., comparing the first signal and the second signal may include determining a difference signal between the first signal and the second signal) (see Column 8, line 21, to Column 10, line 64); and
classify a contact type associated with a contact on the surface of the object (i.e., such pressure signals from multiple sensors may be used to detect movements in any direction and the combinations of pressure signals may be used to differentiate between ambient pressure changes, sounds, and motion. For example, pressure signals detected simultaneously at three sensors may correspond to an ambient air pressure change or sound pressure signal) (see Column 3, line 1, to Column 4, line 28) based on the comparison data (i.e., characterizing a motion based on the comparing) (see Column 8, line 21, to Column 10, line 64); but does not explicitly teach a memory configured to store an audio signal and a motion signal, and that the one or more processors is configured to quantify frequency characteristics of the audio signal and the motion signal and quantify amplitude characteristics of the audio signal and the motion signal.
Regarding the memory, Littrell teaches a memory (i.e., controller 108 can include one or more storage components (e.g., volatile memory, non-volatile memory, a hard drive, or combinations of them, among others)) (see Column 4, line 24, to Column 5, line 37) configured to store an audio signal (i.e., signals 105 produced by the acoustic transducer 104) (see Column 4, line 24, to Column 5, line 37) and a motion signal (i.e., signals 107 produced by the accelerometer 106) (see Column 4, line 24, to Column 5, line 37); and
one or more processors (i.e., controller 108 can include one or more processing components (e.g., a central processing unit (CPU), an application specific integrated circuit (ASIC), a logic circuit, or combinations of them, among others)) (see Column 4, line 24, to Column 5, line 37) configured to quantify frequency characteristics of the audio signal (i.e., a liveness detection algorithm, such as an algorithm that extracts features from one or both of the signals 105, 107 and uses mixture models, neural network models, or other techniques to identify artifacts in the signals) (see Column 4, line 26, to Column 7, line 52) and the motion signal (i.e., the controller 108 can apply a voice activity detection algorithm to the signal 107 to detect voice activity by the user of the device 200. Such an algorithm can first extract features from the signal 107, such as number of zero crossings, relative amplitude levels in different frequency bands, changes in levels over time, energy, power, signal-to-noise ratio, pitch, or combinations of them, among others) (see Column 4, line 26, to Column 7, line 52);
quantify amplitude characteristics of the audio signal (i.e., a liveness detection algorithm, such as an algorithm that extracts features from one or both of the signals 105, 107 and uses mixture models, neural network models, or other techniques to identify artifacts in the signals) (see Column 4, line 26, to Column 7, line 52) and the motion signal (i.e., the controller 108 can apply a voice activity detection algorithm to the signal 107 to detect voice activity by the user of the device 200. Such an algorithm can first extract features from the signal 107, such as number of zero crossings, relative amplitude levels in different frequency bands, changes in levels over time, energy, power, signal-to-noise ratio, pitch, or combinations of them, among others) (see Column 4, line 26, to Column 7, line 52); and
classify a contact type associated with a contact on the surface of the object based on the comparison data (i.e., if the characteristics of the signal 105 are sufficiently similar to the stored characteristics, but the analysis of the signal 107 indicates an absence of voice activity by the user, then the controller 108 can determine that the sensed acoustic signal is from a recording) (see Column 4, line 26, to Column 7, line 52). In view of the teaching of Littrell, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to have added a memory in order to store and retrieve the measured signals and algorithms for evaluating motion information.
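As an illustration of “quantify frequency characteristics” and “quantify amplitude characteristics,” the sketch below extracts the kinds of features the cited Littrell passages list (zero crossings, relative levels in frequency bands, energy, amplitude); the band cutoff and sample rate are assumptions of this sketch.

```python
# Assumed sketch: quantifying frequency and amplitude characteristics of a
# signal using features of the kind listed in the cited Littrell passages.
import numpy as np

def quantify_features(signal: np.ndarray, sample_rate: float = 16000.0) -> dict:
    # Frequency characteristics.
    zero_crossings = int(np.sum(np.abs(np.diff(np.signbit(signal).astype(int)))))
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    low_band = float(np.sum(spectrum[freqs < 1000.0]))   # cutoff is illustrative
    high_band = float(np.sum(spectrum[freqs >= 1000.0]))
    # Amplitude characteristics.
    peak_amplitude = float(np.max(np.abs(signal)))
    energy = float(np.sum(signal ** 2))
    return {"zero_crossings": zero_crossings,
            "low_band_level": low_band,
            "high_band_level": high_band,
            "peak_amplitude": peak_amplitude,
            "energy": energy}
```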
Regarding claim 14, Wiesbauer as modified by Littrell as disclosed above does not directly or implicitly teach that the memory is configured to store relative position information for the microphone and the motion sensor in the memory, wherein the one or more comparisons of the audio signal and the motion signal use the relative position information to generate the comparison data. However, Littrell teaches that the memory is configured to store relative position information for the microphone and the motion sensor in the memory, wherein the one or more comparisons of the audio signal and the motion signal use the relative position information to generate the comparison data (i.e., the stored biometric characteristics include characteristics of signals previously captured by the voice accelerometer and/or the acoustic transducer at different positions of the device relative to the user) (see Column 8, lines 1-61). In view of the teaching of Littrell, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to have added a memory in order to store and retrieve the measured signals and algorithms for evaluating motion information.
Regarding claim 15, Wiesbauer as modified by Littrell as disclosed above does not directly or implicitly teach that the memory is further configured to store a plurality of audio signals from a plurality of microphones including the microphone; wherein the relative position information further comprises relative positions for the plurality of microphones; and wherein the comparison data is further generated using the plurality of audio signals and the relative position information for the plurality of microphones. However, Littrell teaches that the memory is further configured to store a plurality of audio signals from a plurality of microphones including the microphone; wherein the relative position information further comprises relative positions for the plurality of microphones; and wherein the comparison data is further generated using the plurality of audio signals and the relative position information for the plurality of microphones (i.e., the stored biometric characteristics include characteristics of signals previously captured by the voice accelerometer and/or the acoustic transducer at different positions of the device relative to the user) (see Column 8, lines 1-61). In view of the teaching of Littrell, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to have added a memory in order to store and retrieve the measured signals and algorithms for evaluating motion information.
Regarding claim 16, Wiesbauer as modified by Littrell as disclosed above does not directly or implicitly teach that the one or more processors are configured to implement a machine learning engine trained to select the contact type from a plurality of contact types using the comparison data. However, Littrell teaches that the one or more processors are configured to implement a machine learning engine trained to select the contact type from a plurality of contact types using the comparison data (i.e., the controller 108 can determine whether the characteristics of the live signal correspond to the stored biometric characteristics for the user within a threshold level of similarity. In some examples, the controller 108 can apply a Gaussian mixture model or another probabilistic model, a neural network or another machine learning model, or combinations of them, among others to the live and stored characteristics to determine whether they sufficiently correspond to one another. As another example, the controller 108 can compute a cross-correlation (e.g., sliding dot product) of the live and stored characteristics (which can be represented as deterministic signals), calculate an error (e.g., mean-squared error) between the live and stored characteristics, or perform another similarity analysis to determine a similarity between the live and stored characteristics. The controller 108 can use similar techniques to compare 506 characteristics of the acceleration signal 107 with the stored acceleration signal characteristics for the user, and to compare 508 a relationship between the signals 105, 107 and the previously captured relationship. By comparing three sets of characteristics, it becomes much more difficult for one user to imitate or be confused with another user relative to techniques using only one or two of these comparisons) (see Column 5, line 66, to Column 7, line 51). In view of the teaching of Littrell, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to have used a machine learning module in order to further increase the motion detection accuracy by improving the detection and comparison accuracy.
Regarding claim 17, Wiesbauer teaches a device comprising:
one or more processors configured to:
obtain the audio signal (i.e., MEMS microphones 342 and 344 receive pressure waves from an ambient environment and transduce the pressure waves to electrical signals. The electrical pressure signals are processed by audio processing circuit 348 including a filter and other audio processing circuits commonly used for microphone signal processing and then supplying audio signals to speaker or another component configured to receive audio signals) (see Column 8, line 21, to Column 10, line 64) based on detection of sound (i.e., MEMS microphones 342 and 344 receive pressure waves from an ambient environment and transduce the pressure waves to electrical signals) (see Column 8, line 21, to Column 10, line 64) by a microphone (i.e., MEMS microphones 342 and 344) (see Column 8, line 21, to Column 10, line 64);
obtain the motion signal (i.e., MEMS microphones 342 and 344 receive pressure waves from an ambient environment and transduce the pressure waves to electrical signals. The electrical pressure signals are processed by motion processing circuit 350) (see Column 8, line 21, to Column 10, line 64) based on detection of motion (i.e., MEMS microphones 342 and 344 receive pressure waves from an ambient environment and transduce the pressure waves to electrical signals) (see Column 8, line 21, to Column 10, line 64) by a motion sensor (i.e., MEMS microphones 342 and 344, wherein each pressure sensor of the plurality of pressure sensors may be one of a group consisting of a static pressure sensor, a dynamic pressure sensor, and a microelectromechanical (MEMS) microphone) (see Column 8, line 21, to Column 10, line 64) mounted on a surface of an object (i.e., pressure sensor may be coupled to a structure that exhibits a reproducible motion and detecting the motion may include detecting the reproducible motion) (see Column 8, line 21, to Column 10, line 64);
generate digital correlation data for the audio signal (i.e., MEMS microphones 342 and 344 receive pressure waves from an ambient environment and transduce the pressure waves to electrical signals. The electrical pressure signals are processed by audio processing circuit 348) (see Column 8, line 21, to Column 10, line 64);
generate digital correlation data for the motion signal (i.e., MEMS microphones 342 and 344 receive pressure waves from an ambient environment and transduce the pressure waves to electrical signals. The electrical pressure signals are processed by motion processing circuit 350) (see Column 8, line 21, to Column 10, line 64);
generate joint correlation data for the audio signal and the motion signal (i.e., comparing the first signal and the second signal may include determining a difference signal between the first signal and the second signal) (see Column 8, line 21, to Column 10, line 64); and
select a classification (i.e., such pressure signals from multiple sensors may be used to detect movements in any direction and the combinations of pressure signals may be used to differentiate between ambient pressure changes, sounds, and motion. For example, pressure signals detected simultaneously at three sensors may correspond to an ambient air pressure change or sound pressure signal) (see Column 3, line 1, to Column 4, line 28) based on the joint correlation data (i.e., characterizing a motion based on the comparing) (see Column 8, line 21, to Column 10, line 64); but does not explicitly teach a memory configured to store an audio signal and a motion signal.
Regarding the memory, Littrell teaches a memory (i.e., controller 108 can include one or more storage components (e.g., volatile memory, non-volatile memory, a hard drive, or combinations of them, among others)) (see Column 4, line 24, to Column 5, line 37) configured to store an audio signal (i.e., signals 105 produced by the acoustic transducer 104) (see Column 4, line 24, to Column 5, line 37) and a motion signal (i.e., signals 107 produced by the accelerometer 106) (see Column 4, line 24, to Column 5, line 37); and
one or more processors (i.e., controller 108 can include one or more processing components (e.g., a central processing unit (CPU), an application specific integrated circuit (ASIC), a logic circuit, or combinations of them, among others)) (see Column 4, line 24, to Column 5, line 37) configured to select a classification (i.e., determining (e.g., by the controller 108) that at least one of the characteristics of the acceleration signal and/or the acoustic signal correspond to at least one of the stored biometric characteristics with a threshold level of similarity (or determining that the user is not authenticated in response to characteristics of one or both of the signals not satisfying the threshold level of similarity to the stored biometric characteristics). In some examples, a control signal that controls a function of the device or another device (such as a function of an application or service executing on the device or the other device) is generated in response to authenticating the user. Such a signal can indicate whether or not the user is authenticated) (see Column 4, line 26, to Column 9, line 31) based on the joint correlation data (i.e., comparing characteristics of the acceleration signal and/or the acoustic signal with the stored biometric characteristics includes comparing characteristics of the accelerometer signal with characteristics of a previously captured voice accelerometer signal, comparing characteristics of the acoustic signal with characteristics of a previously captured acoustic transducer signal, or comparing a relationship between the accelerometer signal and the acoustic signal with a relationship between the previously captured voice accelerometer signal and the previously captured acoustic transducer signal, or combinations of them) (see Column 4, line 26, to Column 9, line 31). In view of the teaching of Littrell, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to have added a memory in order to store and retrieve the measured signals and algorithms for evaluating motion information.
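For claim 17's per-signal and joint correlation data, one plausible reading is sketched below using autocorrelations and a cross-correlation; the classification labels, decision rule, and threshold are assumptions of this sketch, not teachings of either reference.

```python
# Assumed sketch of claim 17's data flow: correlation data for each signal,
# joint correlation data across signals, and a classification selected from
# the joint data (the decision rule here is purely illustrative).
import numpy as np

def correlation_data(audio: np.ndarray, motion: np.ndarray):
    audio_corr = np.correlate(audio, audio, mode="full")     # audio correlation data
    motion_corr = np.correlate(motion, motion, mode="full")  # motion correlation data
    joint_corr = np.correlate(audio, motion, mode="full")    # joint correlation data
    return audio_corr, motion_corr, joint_corr

def select_classification(audio: np.ndarray, motion: np.ndarray,
                          peak_floor: float = 0.5) -> str:
    # Normalize so the joint-correlation peak is comparable to a fixed threshold.
    audio = (audio - audio.mean()) / (audio.std() + 1e-12)
    motion = (motion - motion.mean()) / (motion.std() + 1e-12)
    _, _, joint = correlation_data(audio, motion)
    peak = float(np.max(np.abs(joint))) / len(audio)
    return "contact" if peak >= peak_floor else "non-contact"
```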
Regarding claim 18, Wiesbauer as modified by Littrell as disclosed above does not directly or implicitly teach that the classification is further based on a magnitude of the audio signal and a magnitude of the motion signal. However, Littrell teaches that the classification is further based on a magnitude of the audio signal (i.e., the controller 108 can compare characteristics of the acoustic signal 105 with stored biometric characteristics for the user as described herein to determine whether the characteristics of the signal 105 sufficiently correspond to the stored characteristics) (see Column 7, lines 1-67) and a magnitude of the motion signal (i.e., the controller 108 can compare one or more features of the signal 107, such as an amplitude of the signal, an energy of the signal, or both, among others, with one or more thresholds. If some or all of the features of the signal 107 satisfy (e.g., exceed) a corresponding threshold, the controller 108 can determine that the user of the device 200 is speaking) (see Column 7, lines 1-67). In view of the teaching of Littrell, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to have performed appropriate signal processing in order to further improve the accuracy of the motion detection.
Regarding claim 19, Wiesbauer as modified by Littrell as disclosed above does not directly or implicitly teach that the classification is selected from a first classification set including a scratch classification, a dent classification, a touch classification, and a non-contact classification. However, Littrell teaches that the classification is selected from a first classification set including a touch classification and a non-contact classification (i.e., since the voice accelerometer 106 may not contact the user's head in the same way every time, the controller 108 can prompt the user to remove and reinstall the device 200 during repetitions of the enrollment process in order to capture characteristics of the user's voice from a range of contact levels that the device 200 may have with the user's head) (see Column 5, lines 5-37). In view of the teaching of Littrell, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to have established a relationship between the audio signal and the motion signal in order to further improve the motion detection accuracy.
Claim 20 is rejected under 35 U.S.C. 103 as being unpatentable over Wiesbauer et al. (Pat. No. US 9,945,746) (hereafter Wiesbauer) in view of Littrell et al. (Pat. No. US 12,216,746) (hereafter Littrell), and further in view of Wingate et al. (Pub. No. US 2014/0226838) (hereafter Wingate).
Regarding claim 20, Wiesbauer as modified by Littrell as disclosed above does not directly or implicitly teach that the classification includes a first value from the first classification set and a second value from a second classification set, the second classification set including a damage classification and a non-damage classification.
Regarding the second classification set, Littrell teaches that the classification includes a first value from the first classification set (i.e., since the voice accelerometer 106 may not contact the user's head in the same way every time, the controller 108 can prompt the user to remove and reinstall the device 200 during repetitions of the enrollment process in order to capture characteristics of the user's voice from a range of contact levels that the device 200 may have with the user's head) (see Column 5, lines 5-37) and a second value from a second classification set (i.e., the controller 108 can determine whether the characteristics of the live signal correspond to the stored biometric characteristics for the user within a threshold level of similarity. In some examples, the controller 108 can apply a Gaussian mixture model or another probabilistic model, a neural network or another machine learning model, or combinations of them, among others to the live and stored characteristics to determine whether they sufficiently correspond to one another. As another example, the controller 108 can compute a cross-correlation (e.g., sliding dot product) of the live and stored characteristics (which can be represented as deterministic signals), calculate an error (e.g., mean-squared error) between the live and stored characteristics, or perform another similarity analysis to determine a similarity between the live and stored characteristics. The controller 108 can use similar techniques to compare 506 characteristics of the acceleration signal 107 with the stored acceleration signal characteristics for the user, and to compare 508 a relationship between the signals 105, 107 and the previously captured relationship. By comparing three sets of characteristics, it becomes much more difficult for one user to imitate or be confused with another user relative to techniques using only one or two of these comparisons) (see Column 5, line 25, to Column 6, line 67). In view of the teaching of Littrell, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to have computed further similarity analysis in order to further increase the motion detection accuracy.
Regarding the damage and non-damage classification, Wingate teaches that the second classification set including a damage classification and a non-damage classification (i.e., multi-element microphones may be useful in other application areas in which a separation of a signal by a combination of sound structure and direction of arrival can be used. For example, acoustic sensing of machinery (e.g., a vehicle engine, a factory machine) may be able to pinpoint a defect, such as a bearing failure not only by the sound signature of such a failure, but also by a direction of arrival of the sound with that signature. In some cases, prior information regarding the directions of machine parts and their possible failure (i.e., noise making) modes are used to enhance the fault or failure detection process. In a related application, a typically quiet environment may be monitored for acoustic events based on their direction and structure, for example, in a security system) (see paragraphs [0144]-[0148]). In view of the teaching of Wingate, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to have further classified the status of the device, such as failure or damage, in order to confirm the reliability of the motion detection.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure: see PTO-892.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to TRAN M. TRAN whose telephone number is (571)270-0307. The examiner can normally be reached Mon-Fri 11:30am - 7:00pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Laura Martin, can be reached on (571) 272-2160. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/Tran M. Tran/Examiner, Art Unit 2855