Notice of Pre-AIA or AIA Status
1. The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 101
2. 35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
3. Claims 1-14 and 17-18 are rejected under 35 U.S.C. 101 because the claimed invention is directed to a judicial exception without significantly more.
Taking independent claim 1 as representative, the claims recite, in part and at least, the following:
An apparatus, comprising:
at least one processor; and
at least one memory storing instructions that, when executed by the at least one processor, causes the apparatus at least to:
receive data indicative of a positive or negative classification based on comparing an output value, generated by a computational model responsive to an input data, with a threshold value which divides a range of output values of the computational model into positive and negative classes of output values, a positive or a negative classification being usable by the apparatus, or another apparatus, to trigger one or more processing operations;
determine that the positive or negative classification is a false classification based on one or more events detected subsequent to generation of the output value; and
update the threshold value responsive to determining that the positive or negative classification is a false classification.
The claim, comprising an apparatus as recited, satisfies Step 1 of the Subject Matter Eligibility Test (MPEP 2106), which determines whether the claimed invention is directed to a process, machine, manufacture, or composition of matter. In this instance, claim 1’s apparatus qualifies as a machine under the Step.
Turning to Step 2A, Prong One, the test determines whether the claim recites an abstract idea, law of nature, or natural phenomenon. MPEP 2106.04 (I-II) and 2106.04(a). Claim 1 is reproduced again just below, this time with the limiting features that are directed to an abstract idea (to be discussed in 2A Prong One) in bold and the features that are directed to an additional element (to be discussed in 2A Prong Two and 2B) underlined:
An apparatus, comprising:
at least one processor; and
at least one memory storing instructions that, when executed by the at least one processor, causes the apparatus at least to:
receive data indicative of a positive or negative classification based on comparing an output value, generated by a computational model responsive to an input data, with a threshold value which divides a range of output values of the computational model into positive and negative classes of output values, a positive or a negative classification being usable by the apparatus, or another apparatus, to trigger one or more processing operations;
determine that the positive or negative classification is a false classification based on one or more events detected subsequent to generation of the output value; and
update the threshold value responsive to determining that the positive or negative classification is a false classification.
In view of the bolded features, the claim is principally directed to an active determining step and an active updating step, each following a detailed data-receiving step (which the Examiner will address below regarding Step 2A Prong Two and Step 2B).
Regarding the determining step, the determination of a classification as a false classification on some basis/information, e.g., events detected subsequent to the generation of the computational model’s output, is an evaluation or judgment. In other words, it is a mental step and hence an abstract idea.
Regarding the updating step, the changing of the threshold value based on the determining step is essentially changing a parameter for the determining step. For example, subject to this updating step, a model’s computation output is now compared to a different and updated threshold value. In terms of understanding the claim as essentially a mental process, this limitation is akin to changing/adjusting a boundary operative in making a binary classification. In other words, it is akin to tweaking or refining the mental step, and hence is an adjustment of an abstract idea.
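To illustrate the Examiner’s characterization of the compare/determine/update logic discussed above, the following short sketch is provided. It is purely illustrative: the function names, the direction and size of the adjustment, and all numeric values are the Examiner’s hypothetical assumptions, not limitations recited in the claims.

```python
# Illustrative sketch only; names and values are hypothetical.

def classify(output_value: float, threshold: float) -> bool:
    """Divide the model's output range into positive/negative classes."""
    return output_value >= threshold  # True = positive classification

def update_threshold(threshold: float, false_positive: bool,
                     step: float = 0.05) -> float:
    """Adjust the boundary after a classification is judged false."""
    # A false positive suggests the boundary was too permissive: raise it.
    # A false negative suggests it was too strict: lower it.
    return threshold + step if false_positive else threshold - step

threshold = 0.7
is_positive = classify(0.72, threshold)  # positive classification
# Suppose events detected subsequently show the classification was false;
# the threshold (the boundary of the binary classification) is then moved:
threshold = update_threshold(threshold, false_positive=is_positive)
```

As the sketch shows, the updating step merely moves the boundary used by the (mental) classification step, consistent with the characterization above.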
Moving on to Step 2A, Prong Two, the test determines whether the claim recites additional elements that integrate the judicial exception into a practical application. MPEP 2106.04(d). The additional elements are as noted below:
Regarding the limitations for an apparatus with processor and memory elements to store instructions for the processor to execute, these are merely details describing a general purpose computer onto which the abstract idea is being practiced or exercised. These limitations have the effect of taking the abstract idea, e.g. the determining and the updating as discussed above, and merely saying “apply it” in a computer environment or platform. Hence, they are not considered sufficient to integrate the abstract idea into a practical application.
Regarding the limitation for receiving data of a classification result, the Examiner understands this as collecting data, which is extra-solution activity that is incidental to the abstract idea itself.
The recitation is detailed and provides clarification that the received data of the classification result is generated by a computational model responsive to input data; however, the Examiner notes this is a passively recited step and merely clarifies a basis for data being actively received. Moreover, even if construed as active, the computational model is merely math and hence again could be understood to be a broadening of the abstract idea as already discussed to now further encompass a high level mathematical computation step.
The recitation further clarifies that the threshold divides the output range into positive and negative classes; this is merely further detail that describes how the classification is performed. Moreover, the classification itself is passively recited and merely clarifies a basis for data being actively received. Even if it were considered to be an active step, performing a binary classification of an input into two classes, when recited at this level of generality, is still akin to a mental step and hence an abstract idea.
The recitation further clarifies that the classification result is “usable by the apparatus, or another apparatus, to trigger one or more processing operations.” The Examiner notes again that this is not an active recitation. It merely provides some detail that the classification result, which is actively received, can be used for some purpose via some steps and those steps are not actively recited and hence fall outside the scope of the claim for the purposes of this analysis.
Finally, with Step 2B, the test evaluates the additional elements to determine whether they amount to significantly more than the judicial exception. MPEP 2106.05. Revisiting the additional elements as discussed above per Step 2A Prong Two, the Examiner does not believe the additional elements are sufficient to provide significantly more than the abstract idea as otherwise characterized by the Examiner. For example, the collection of data, no matter how much clarifying context is given of its basis, is still just a data collection step, which is generally not understood to provide significantly more to an abstract idea. The recitations of the apparatus with processor and memory elements are likewise generally understood to merely take the abstract idea and “apply it” to a computer, and hence do not provide significantly more.
For these reasons, independent claim 1 as discussed above is understood by the Examiner as being an abstract idea without significantly more, and is hence rejected under 35 U.S.C. 101.
Independent claim 18 includes the same or similar limitations as claim 1 discussed above, but as recited in a method. Hence, it is rejected under essentially the same rationale as discussed above for claim 1.
At this time, the Examiner declines to reject dependent claims 15-16.
However, dependent claims 2-14 and 17 are rejected. They include in part the same features as claim 1 discussed above and do not otherwise cure its deficiencies.
Claims 2-3 merely recite that the events detected per claim 1’s passive recitation, as used to determine a classification to be false, constitute feedback data. As noted, the event detection aspect is not active, and at best the claims merely clarify that the result of the detection is a basis for an evaluation/judgment and that said result is part of a feedback feature/aspect. Hence, when understood this way, the further limitation does not provide anything to integrate the abstract idea further into a practical application or otherwise provide significantly more.
Claims 4-5 appear to clarify that a positive classification, together with a determination of a negative interaction, is defined as a false classification. In the Examiner’s view, these are a clarification of the abstract idea in that they merely refine an evaluation or judgment step that is recited somewhat more broadly in claim 1.
Claims 6-7 are similar to claims 4-5 but address a different negative interaction event. Hence, in the Examiner’s view, these are likewise a clarification of the abstract idea in that they merely refine an evaluation or judgment step that is recited somewhat more broadly in claim 1.
Claims 8-14 relate in various degrees of detail to how the threshold is adjusted. Each clarification provided in these claims assumes the form of an algorithmic-type step that merely serves to adjust or detail claim 1’s judgment/evaluation step, or an updating/refining thereof.
Claim 17 clarifies that claim 1’s apparatus includes a computational model, which at best is understood to provide a computation result that is the subject of classification for some input. In other words, the model, when recited at this level of generality, is just a mental step or process or even arithmetic/math, and on that basis does not meaningfully integrate the abstract idea as already identified further into a practical application or provide significantly more than the abstract idea.
Claim Rejections - 35 USC § 102
4. In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
5. The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office Action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
6. Claims 1-9, 12, and 15-18 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by U.S. Patent Application Publication No. 2016/0077794 (“Kim”, as cited and made of record via Applicants’ IDS submission).
Regarding claim 1, Kim teaches An apparatus (Kim teaches a framework (FIG. 1) where a server system and user device work together to provide an always listening speech-based virtual assistant (Abstract, FIG. 3), and where the taught framework allows for a flexibility in terms of division of functionalities between the server and the user device ([0023]-[0027]) that essentially permits the reference’s features as mapped below to execute on either the server, the user device, or some combination of both), comprising:
at least one processor; and at least one memory storing instructions that, when executed by the at least one processor (regarding the processor, see FIG. 6 element 606 for a teaching of a processor on the user device (and the discussion per [0023]-[0027] to support an understanding that a processor on a server could likewise execute the same/similar functions/steps), and further regarding memory, see [0032]-[0034] for a teaching of memory elements that can be found in the server, the user device, or some combination of both for the storing of instructions in support of the instructions being executable to perform functions as mapped below), causes the apparatus at least to:
receive data indicative of a positive or negative classification based on comparing an output value, generated by a computational model responsive to an input data, with a threshold value which divides a range of output values of the computational model into positive and negative classes of output values, a positive or a negative classification being usable by the apparatus, or another apparatus, to trigger one or more processing operations (FIG. 3 teaching that audio input is received (i.e., “input data” as recited) and subject to sampling to determine a confidence level as to whether a wake word or its equivalent was received via the audio input (i.e., confidence level is the recited “output value”, and the processing that determines confidence level is the recited “computational model”), and that the confidence level is compared with a threshold to accordingly determine whether further processing via a triggered virtual assistant is warranted (i.e., “threshold value”, as recited, to divide confidence levels into [wake word was detected] or [wake word was not detected], and the functions performed by the triggered virtual assistant is akin to the recited “one or more processing operations”));
determine that the positive or negative classification is a false classification based on one or more events detected subsequent to generation of the output value; and update the threshold value responsive to determining that the positive or negative classification is a false classification (generally, see FIG. 3’s step 312 for a dynamic adjustment of the threshold in response to perceived events (e.g., to address the problems/need described in [0005]-[0007]), where in some instances the perceived events as contemplated are akin to “a false classification” as recited: in a first instance, [0051] discussing a use case where a trigger is missed for having a confidence level/score below the threshold but within a tolerance range of under the threshold, in which case the threshold can be selectively lowered to permit improved processing should the user repeat a trigger phase; and/or in a second instance, [0074] discussing a use case where a trigger for a wake word is missed but the command following the missed wake word is successfully recognized and classified, thereby suggesting that a wake word may have been missed and then taking the additional step to retroactively adjusting the trigger threshold to reevaluate the wake word and hence not miss what was missed in terms of triggering a voice assistant; and/or in a third instance, [0077] discussing a use case with a user responding to a false trigger with “not now” which then has the effect of adjusting the threshold).
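The first Kim use case relied upon above ([0051]) can be sketched as follows for context: a missed trigger whose confidence level falls within a tolerance band just under the threshold causes the threshold to be lowered so that a repeated trigger phrase is more likely to succeed. The sketch is purely illustrative; the function name is hypothetical, and the numeric values follow the reference’s own example (threshold of 70, band of 50-69), with the lowered value chosen arbitrarily for illustration.

```python
# Illustrative sketch of Kim [0051]; function name and the lowered
# threshold value are hypothetical, band values follow the reference.

def adjust_after_missed_trigger(confidence: float, threshold: float,
                                band_low: float = 50.0,
                                lowered_threshold: float = 60.0) -> float:
    """Lower the trigger threshold if a miss landed within the tolerance band."""
    if band_low <= confidence < threshold:
        # In-band miss: lower the threshold so a repeated trigger fires.
        return lowered_threshold
    return threshold  # out-of-band: leave the threshold unchanged

# Per the reference's example, a trigger scored 65 against a threshold
# of 70 is missed but in-band, so the threshold is lowered:
new_threshold = adjust_after_missed_trigger(65.0, 70.0)
```

This corresponds to the claimed updating of the threshold value responsive to determining a false (here, false negative) classification.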
Regarding claim 2, Kim teaches the apparatus of claim 1, further configured to determine a false classification based on feedback data indicative of the one or more events detected subsequent to generation of the output value (in further reliance on two of the three use cases discussed above in relation to claim 1, specifically citing the reference’s [0074] (where the successful recognition of the command utterance is feedback indicating that a trigger prior to it was missed and hence falsely classified) and [0077] (where the recognition and processing of the user’s “not now” subsequent to a false trigger is feedback indicating that a trigger prior to it was falsely classified)).
Regarding claim 3, Kim teaches the apparatus of claim 2, wherein, responsive to a positive classification, the apparatus is further configured to determine a false classification based on the feedback data indicating a negative classification event associated with one or more further classification processes triggered by the positive classification (in further reliance on one of the three use cases discussed above in relation to claim 1, specifically citing the reference’s [0077] (where the recognition and processing of the user’s “not now” (i.e., a negative classification) subsequent to a false trigger is feedback indicating that a trigger prior to it was falsely classified as a positive classification)).
Regarding claim 4, Kim teaches the apparatus of claim 2, wherein, responsive to a positive classification, the apparatus is further configured to determine a false classification based on the feedback data indicating a negative interaction event associated with the computational model (in further reliance on one of the three use cases discussed above in relation to claim 1, specifically citing the reference’s [0077] (where the recognition and processing of the user’s “not now” (i.e., a negative classification) subsequent to a false trigger is feedback indicating that a trigger prior to it was falsely classified as a positive classification)).
Regarding claim 5, Kim teaches the apparatus of claim 4, wherein, in response to receiving no input data, or input data below a predetermined threshold, by the computational model within a predetermined time period subsequent to generation of the output value, the apparatus is configured to indicate a negative interaction event by the feedback data ([0057]: “For example, in response to receiving a notification (e.g., new email 434 shown in notification interface 432), the virtual assistant can be triggered to receive a user command from the audio input without detecting receipt of a spoken trigger. Should a user utter a command following the notification (e.g., “read that to me”), the user's intent can be determined and the command can be executed (e.g., the newly received email can be read out) without first requiring a particular speech trigger. Should a user not utter a command within a certain time, a user intent is unlikely to be determined from the input, and no virtual assistant action may take place. ... ”).
Regarding claim 6, Kim teaches the apparatus of claim 2, wherein, responsive to a negative classification, the apparatus is further configured to determine a false classification based on the feedback data indicating a repeat interaction event associated with the computational model ([0079]: “In some examples, the predetermined level of a speech trigger threshold can be adjusted over time based on usage. ... In contrast, based on user feedback of frequent missed triggers (e.g., repeated trigger phrases, manual assistant activation, manual indications, etc.), the predetermined level of a speech trigger can be lowered over time to make triggering more likely, and the level can continue to be lowered based on how frequent missed triggers are going forward.”).
Regarding claim 7, Kim teaches the apparatus of claim 6, wherein in response to the computational model receiving the same or similar input data to the previous input data within a predetermined time period subsequent to generation of the output value, the apparatus is further configured to indicate a repeat interaction event by the feedback data ([0079]: “In some examples, the predetermined level of a speech trigger threshold can be adjusted over time based on usage. ... In contrast, based on user feedback of frequent missed triggers (e.g., repeated trigger phrases, manual assistant activation, manual indications, etc.), the predetermined level of a speech trigger can be lowered over time to make triggering more likely, and the level can continue to be lowered based on how frequent missed triggers are going forward.”).
Regarding claim 8, Kim teaches the apparatus of claim 1, wherein:
for a false positive classification, the updated threshold value has a value within the positive class of output values; or for a false negative classification, the updated threshold value has a value within the negative class of output values ([0050]: “At block 312, the speech trigger confidence level threshold used at decision block 308 can be dynamically adjusted in response to perceived events. In some examples, the threshold can be lowered, which can increase the sensitivity of the trigger, thus increasing the likelihood that audio input will be recognized as a trigger. In other examples, the threshold can be raised, which can decrease the sensitivity of the trigger, thus decreasing the likelihood that audio input will be recognized as a trigger. The threshold can be dynamically raised and/or lowered in response to a variety of perceived events, conditions, situations, and the like. For example, the threshold can be lowered to increase the likelihood of triggering when perceived events suggest that a user is likely to utter a speech trigger, a false trigger would be minimally inconvenient (e.g., while in a noisy environment), a missed trigger would be relatively inconvenient (e.g., while driving), or the like. Conversely, the threshold can be raised to decrease the likelihood of triggering when perceived events suggest that a user is unlikely to utter a speech trigger, a false trigger would be especially undesirable (e.g., during an important meeting or in a quiet environment), a missed trigger would be minimally inconvenient, or the like.”).
Regarding claim 9, Kim teaches the apparatus of claim 8, wherein updating the threshold value comprises modifying the threshold value by a predetermined amount d within the respective positive or negative classes of output values and using the modified threshold value as the updated threshold value if the modified threshold value satisfies a predetermined rule ([0056]: “Any type of notification can be detected to cause the speech trigger threshold to be lowered, including incoming calls, new messages, application notifications, alerts, calendar reminders, or the like. In addition, in some examples, different types of notifications can cause the threshold to be lowered by different amounts. For example, the threshold can be lowered more for an incoming call than for an application notification because a user may be more likely to issue voice commands associated with an incoming call than for an application notification. In other examples, the threshold can be raised in response to detecting a notification. For example, the threshold can be raised in response to a calendar reminder that a meeting is starting such that the virtual assistant can be less likely to falsely trigger and interrupt the meeting. Notifications can thus be used in a variety of ways to adjust the speech trigger threshold.”, from which the Examiner understands that different rules exist that serve to adjust the threshold, effectively mapping different circumstances/scenarios to different threshold amount differences/adjustments).
Regarding claim 12, Kim teaches the apparatus of claim 9, wherein the predetermined amount d is dynamically changeable based, at least in part, on the generated output value of the computational model ([0051]: “... A user may be likely to retry and speak the trigger phrase a second time (e.g., more clearly, louder, closer to a device, etc.), and it can be desirable to ensure that the repeated trigger is not missed again. The confidence level determined during the missed trigger can be within a predetermined range of the threshold without exceeding the threshold, such as a few points below or within a certain percentage of the threshold. For instance, for a threshold of 70, a predetermined range can be defined as 50-69. In this example, the missed trigger may have been scored at a 65, which would be insufficient to trigger the virtual assistant. Given the perceived trigger within the predetermined range of 50-69, however, the threshold can be lowered such that the repeated trigger phrase can be more likely to trigger the virtual assistant the second time the user utters the phrase. The threshold can thus be lowered based on a prior sampling of the audio input having a confidence level within a predetermined range of the threshold without exceeding the threshold. It should be appreciated that the predetermined range can be determined empirically, set by a user, or the like.”).
Regarding claim 15, Kim teaches the apparatus of claim 1, wherein the apparatus comprises at least part of a digital assistant ([0017]: speech-driven “virtual assistant”), wherein the computational model is configured to receive input data representing at least one of a user utterance or a user gesture ([0017]: “speech trigger”) and to generate an output value indicative of whether the at least one of the user utterance or the user gesture corresponds to a wakeup command for the digital assistant, a positive classification being usable to trigger one or more processing operations for performance by the digital assistant ([0042]: “At block 306, a confidence level can be determined that the sampled audio input comprises a portion of a spoken trigger. In one example, sampled portions of audio can be analyzed to determine whether they include portions of a speech trigger. Speech triggers can include any of a variety of spoken words or phrases that a user can utter to trigger an action (e.g., initiating a virtual assistant session). For example, a user can utter “hey Siri” to initiate a session with a virtual assistant referred to as “Siri.” In other examples, a user can utter a designated assistant name or reference, such as “Assistant,” “Siri,” “Secretary,” “Hi Assistant,” “Hello Helper,” or any other names or references.”).
Regarding claim 16, Kim teaches the apparatus of claim 15, wherein the one or more processing operations comprise one or more of:
responding to a query received after the wakeup command ([0043]: “In other examples, spoken triggers can include commands, actions, queries, or other actionable words or phrases.” and “In other examples, a virtual assistant session can be initiated in response to such utterances and the associated functions can be executed without further spoken input from the user.”); or controlling a remote electronic system in communication with the digital assistant ([0020]: “ In other examples, speech trigger thresholds can be used to cause a variety of other actions and to initiate interactions with any of a variety of other systems and devices ...”).
Regarding claim 17, Kim teaches the apparatus of claim 1, wherein the apparatus comprises the computational model (a model or processing module, e.g. to assign confidence etc., may exist either on the user device or on the server ([0027])) and one or more sensors for providing the input data to the computational model ([0028]-[0030], [0033], and [0036]-[0037] discussing different sensor modalities operative in the taught framework, including for example a microphone).
Regarding claim 18, the claim includes the same or similar limitations as claim 1 discussed above, and is therefore rejected under the same rationale.
Allowable Subject Matter
7. Claims 10-11 and 13-14 are objected to as being dependent upon a rejected base claim, but would be allowable (1) if rewritten in independent form including all of the limitations of the base claim and any intervening claims, and (2) if Applicants are able to overcome the rejections made under 35 U.S.C. 101.
Conclusion
8. The prior art made of record and not relied upon is considered pertinent to Applicants’ disclosure:
US 20030125945 A1
US 9818407 B1
US 12567408 B2
US 20050144187 A1
US 20140316778 A1
US 20190187787 A1
US 20140278389 A1
CN 103971680 B
CN 110111775 A
JP 2016505888 A
WO 2015017303 A1
WO 2021022089 A1
9. Any inquiry concerning this communication or earlier communications from the examiner should be directed to SHOURJO DASGUPTA whose telephone number is (571)272-7207. The examiner can normally be reached M-F 8am-5pm CST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Tamara Kyle can be reached at 571 272 4241. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/SHOURJO DASGUPTA/Primary Examiner, Art Unit 2144