Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 01/09/2026 has been entered.
Status of the Claims
Claims 1-20 are pending.
Response to Applicant’s Arguments
In response to Applicant's argument: “For example, Applicant respectfully submits that the proposed Kim-He combination fails to disclose or suggest to "identify, from among the query list, a corresponding query having a maximum relevance to the first user query" and to "identify a first semantic similarity between the second user query and the corresponding query having the maximum relevance", as claimed. The Office Action acknowledges that Kim does not disclose the cited features of claim 1. See Office Action, p. 6 (citing claim 1 prior to the present amendments). Instead, the Office Action relies on He as allegedly disclosing these features. Applicant respectfully submits that He does not make up for at least the above-discussed deficiencies of Kim”.
In view of the amendments to claims 1, 11, and 20, the rejections over Kim in view of He have been withdrawn. Upon further search and consideration, a new combination of references is set forth in detail below.
Claim Rejections - 35 USC § 103
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 103 that form the basis for the rejections under this section made in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1, 3-4, 9, 11, 13-14, 18, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Kim et al. (US 2020/0051554 A1) in view of Xu et al. (US 2021/0200813 A1).
Regarding Claims 1, 11, and 20, Kim discloses an electronic apparatus (Fig. 2; ¶56, electronic device 201) comprising:
a microphone (Fig. 2, microphone 288);
a speaker (Fig. 2, speaker 282);
a memory / non-transitory computer readable recording medium storing at least one instruction / program (Fig. 2, memory 230; Fig. 4, memory 450; ¶116 and ¶117, memory stores programs and instructions related to allowing processor to process wake-up word and process tasks); and
one or more processors (¶102, Fig. 2, processor 210, Fig. 4, processor 480);
wherein the one or more processors (Fig. 5, processor 480 comprises first processor 510 and second processor 520) are configured to execute the at least one instruction to:
based on identifying that a first voice sensed through the microphone corresponds to a wake-up voice, convert a state of the electronic apparatus from a standby state to a wake-up state (¶¶174-75 and Fig. 7, processor 480 (first processor / wake-up processing unit) detects a wake-up word and activates a voice recognition service; e.g., Fig. 12 and ¶¶265-66, the voice recognition service is in an inactive state as of step 1201 and, upon recognizing a wake-up word through a microphone, the voice recognition service is activated),
based on a second voice being sensed while the state of the electronic apparatus is the wake-up state, identify a first user query included in first voice data obtained based on the second voice (¶176, processor 480 processes a first task on the basis of first voice information / user’s utterance; ¶266, enter a follow-up command processing mode capable of detecting a voice command),
perform a first operation corresponding to the first user query (¶176, processor 480 processes music application execution and music playback in response to “jazz music playback” or increase a volume value of the music application in response to “volume up”; ¶267, process a related command (e.g., task) on the basis of the voice recognition result and process first task related to execution of a function corresponding to the first voice information),
identify a predetermined query corresponding to the first user query (¶274 and ¶276, upon receiving new voice information (e.g., second voice information), convert the user’s utterance (i.e., first voice information and second voice information) into sentence or word and determine whether a pre-set domain list (Fig. 9) has a sentence corresponding to the converted sentence),
obtain a query list comprising at least one query based on the predetermined query and user context information (¶276, analyze a context of the first voice information and determine whether the pre-set domain list has a sentence corresponding to the converted sentence (i.e., first voice information)),
identify a second user query included in second voice data obtained based on a third voice sensed through the microphone (¶276, record user utterance (second voice information per ¶276) and convert it into a sentence or a word to determine whether the pre-set domain list has a sentence corresponding to the converted sentence), and
based on a first similarity between the second user query and the at least one query of the query list, perform a second operation corresponding to the second user query (¶276, compare and analyze a context of first voice information (previous dialog context) with context of second voice information (additional dialog context); ¶¶277-78, determine that the command (second voice information) corresponds to the same function as a function of a previously processed command (first voice information), process a corresponding command in association with a function performed depending on a previous command) and maintain the state of the electronic apparatus as the wake-up state (¶¶242-43, check activation wait time set for recognition of follow-up command or after a task is performed following wake-up and continue to receive audio input corresponding to user’s utterance during activation wait time).
Kim does not disclose identify a relevance to the first user query for each of the at least one query of the query list based on the predetermined query and the user context information and identify, from among the query list, a corresponding query having a maximum relevance to the first user query.
Xu discloses an electronic apparatus (Fig. 6) identifying a first user query included in first voice data (Fig. 1, step 101, ¶22, and Fig. 3, step 204, ¶46, obtain conversation sentence input by a user; per ¶24, user may interact with a smart device through text or by voice; e.g., ¶27, “I heard that something happened to Boeing recently”), identify a predetermined query corresponding to the first user query (¶23, Fig. 1, step 102 and ¶47, Fig. 3, step 205, obtain a query sentence matching the conversation sentence; ¶27, “Vice President of Boeing Apologizes” is the matching query sentence to “I heard that something happened to Boeing recently”), obtain a query list comprising at least one query based on the predetermined query and user context information (¶24, input the conversation sentence based on personal characteristics; ¶26, obtain a plurality of associated query sentences corresponding to the query sentence matching the conversation sentence (which was inputted based on personal characteristics) from a preset query word graph; i.e., the associated query sentences were obtained based on (1) matching query sentence and (2) conversation sentence inputted based on personal characteristics),
identify a relevance to the first user query for each of the at least one query of the query list based on the predetermined query and the user context information (¶28, Fig. 1, step 103 and Fig. 3, ¶51, step 207, process conversation sentence and the plurality of associated query sentences through a preset algorithm to select a target query sentence; per ¶32 and ¶50, obtain contextual sentence vector corresponding to the conversation sentence and a plurality of associated query sentence vectors, calculate similarities between the contextual sentence vector and the plurality of associated query sentence vectors to obtain relevance scores between the conversation sentence and the plurality of associated query sentences),
identify, from among the query list, a corresponding query having a maximum relevance to the first user query (¶53, the higher the relevance score, the stronger the relevance between the conversation sentence and the corresponding associated query sentence so that the associated query sentence having the highest relevance score is determined as the target query sentence; e.g., ¶35, “A Plane Crash of Boeing 737 in Indonesia” is determined as target query sentence to generate the response “CEO of Boeing Apologizes for the Plane Crash of Boeing 737 in Indonesia”)
identify a second user query included in second voice data (¶54, the conversation sentence “why did he apologize?”),
identify a first semantic similarity between the second user query and the corresponding query having the maximum relevance (¶54, obtain target query sentence “A Plane Crash of Boeing 737 in Indonesia” based on contextual sentence corresponding to “Why did he apologize?”; per ¶32, comparing the contextual sentence vector “Why did he apologize?” to target query sentence vector “A Plane Crash of Boeing 737 in Indonesia” to determine relevance score thereof),
based on the first semantic similarity (¶54, obtaining “A Plane Crash of Boeing 737 in Indonesia” as the target query sentence means that “A Plane Crash of Boeing 737 in Indonesia” is the query sentence that has the highest relevance score) being greater than or equal to a predetermined value (¶54 in view of ¶51, amongst query sentences with relevance scores, the target query sentence “A Plane Crash of Boeing 737 in Indonesia” having the highest relevance score means that the target query sentence relevance score is higher than the relevance scores of the other query sentences), perform a second operation corresponding to the second user query (¶55, process the target query sentence to generate a response sentence for the user).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to identify a relevance to the first user query for each of the at least one query of the query list based on the predetermined query and the user context information and to identify, from among the query list, a corresponding query having a maximum relevance to the first user query, in order to satisfy user need and improve the conversation effect (Xu, ¶54).
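For illustration, the relevance scoring attributed to Xu (¶32, ¶¶50-53), in which similarities between a contextual sentence vector and a plurality of associated query sentence vectors are calculated and the candidate having the maximum relevance score is selected as the target query sentence, may be sketched as follows; the vectors, names, and values below are hypothetical stand-ins and are not drawn from the reference:

```python
import math

def cosine(u, v):
    # Cosine similarity between two equal-length vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def select_target_query(context_vec, candidate_vecs):
    # Score each associated query sentence vector against the contextual
    # sentence vector; return the index of the maximum-relevance candidate
    # and its relevance score (Xu, ¶53: highest score wins).
    scores = [cosine(context_vec, v) for v in candidate_vecs]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]

# Hypothetical toy vectors standing in for sentence embeddings.
context = [0.9, 0.1, 0.3]
candidates = [[0.8, 0.2, 0.4], [0.1, 0.9, 0.0]]
idx, score = select_target_query(context, candidates)
```

Here the first candidate is selected because its cosine similarity to the contextual vector is the highest.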
Regarding Claims 3 and 13, Kim discloses wherein the one or more processors are further configured to execute the at least one instruction to:
identify a domain of the first user query, and identify the predetermined query based on the domain (¶276, record user’s utterance, convert it into a sentence or a word, and determine whether a pre-set domain list (e.g., a contextual command for each domain) has a sentence corresponding to the converted sentence).
Regarding Claims 4 and 14, Kim discloses wherein the user context information comprises at least one of a query history of a user (Kim, ¶198, analyzing a context based on previously input command; compare He, ¶45), a query response history of the user (Kim, ¶¶204-205, determine an association of task by analyzing a previous conversational context based on a previously input command and additional conversational context on the basis of a pre-set domain list, the previous conversational context includes a task executed according to a previous command; compare He, ¶44), a location of the user, a current time, an ambient temperature, a current state of the electronic apparatus, or a use history of the electronic apparatus by the user.
Regarding Claims 9 and 18, Kim discloses wherein the one or more processors are further configured to execute the at least one instruction to:
identify an elapsed time from a first time point when the second voice is sensed through the microphone to a second time point when the third voice is sensed through the microphone (¶177, processor 480 counts an activation wait time in sequence with first task processing), and
based on the elapsed time being less than or equal to a predetermined time (¶178, determine whether a time which is set to the activation wait time elapses), identify the second user query from the second voice data obtained based on the third voice (¶181 and Fig. 7, upon determining that the timeout does not occur in operation 709, determine whether second voice information is detected in operation 715; if yes, see ¶¶183-84).
Claims 2 and 12 are rejected under 35 U.S.C. 103 as being unpatentable over Kim et al. (US 2020/0051554 A1) and Xu et al. (US 2021/0200813 A1) as applied to claims 1 and 11 above, and further in view of He et al. (US 2021/0191952 A1).
Regarding Claims 2 and 12, Kim does not disclose wherein the one or more processors are further configured to execute the at least one instruction to:
based on the first semantic similarity being less than the predetermined value, change the state of the electronic apparatus from the wake-up state to the standby state without performing the second operation corresponding to the second user query.
He discloses an electronic apparatus (Figs. 7-8) identifying a first user query included in first voice data (¶¶37-38, collect expression input from the user; e.g., Fig. 1, “Play an English song”), perform a first operation corresponding to the first user query (¶40, using NLU model to perform semantic understanding to obtain an NLU parsing result comprising a domain, an intent, and a slot of the expression input by the user (the slot is used for representing key information describing the intent in the expression input by the user; see ¶54, step 301); e.g., ¶41, domain = music, intent = play music, and slot = [language of the song-English]), identify a predetermined query corresponding to the first user query (¶55 and ¶66, step 302, acquire second key information set corresponding to at least one historical query and determine whether the intent of the first query is the same as or related to the intent of the historical query), obtain a query list comprising at least one query based on the predetermined query and user context information (per ¶¶58-59, assuming the first query currently input by the user is an expression in an N-th round, the second key information set includes slot information from the historical queries of the previous (N-1) rounds, which is maintained in a cache; in view of ¶45, it is necessary to understand the expression in the current round in relation to the expression in the previous round, i.e., to understand session semantics of the user in relation to a context),
identify a relevance to the first user query for each of the at least one query of the query list based on the predetermined query and the user context information (¶64, comprehensively consider the first key information set and the second key information set to determine a plurality of pieces of candidate semantics; in particular, ¶102, acquire a co-occurrence probability between each piece of first key information in the first key information set and each piece of second key information in the subset to determine the conditional probability that the subset also appears under the condition that the first key information set appears),
identify a second user query included in second voice data (¶64 and Fig. 6, the user input “want a piece of jazz music” in the third round is the current input),
identify a first semantic similarity between the second user query and the at least one query of the query list based on the relevance to the first user query (¶60, when the user inputs “want a piece of jazz music” in the third round, the first key information set is now {jazz} and the second key information set acquired from the cache is {English, hit}, corresponding to the historical queries “play an English song” (i.e., first user query) and “want a hit one”; ¶64 and ¶102, comprehensively consider the respective key information sets to determine the co-occurrence probability between each piece of first key information in the first key information set (corresponding to “want a piece of jazz music”, or second user query) and each piece of second key information in the subset (which includes key information {English} corresponding to “play an English song”, i.e., the first user query, as well as key information of other historical queries of the previous (N-1) rounds maintained in a cache) to determine the conditional probability that the subset also appears under the condition that the first key information set appears),
based on the first semantic similarity being greater than or equal to a predetermined value (¶119, use a preset threshold to filter probability scores so that only the candidate semantics with a probability score larger than the preset threshold need to be sorted), perform a second operation corresponding to the second user query (¶141, human machine dialog apparatus carries out answer retrieval on the candidate semantics sequentially according to the sorted sequence; in view of ¶42, play a song for the user).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to set a predetermined value to filter the semantic similarity between the second user query and the at least one query of the query list based on an identified relevance to the first user query, in order to consider only semantic similarity larger than a preset threshold (He, ¶119) when understanding an expression / user query in a current round in relation to an expression / user query in a previous round (He, ¶45).
As modified, based on the first semantic similarity being less than the predetermined value, change the state of the electronic apparatus from the wake-up state to the standby state without performing the second operation corresponding to the second user query (Kim, ¶189, if a follow-up command is not associated with a previous conversational context (i.e., He, ¶66 and ¶102, determine that the intent of the first query is irrelevant to the intent of the historical query based on the co-occurrence probability between the respective queries' key information not being larger than the preset threshold of He, ¶119), ignore the corresponding command; Kim, ¶¶242-43, check the activation wait time set for recognition of a follow-up command or after a task is performed following wake-up; i.e., the device stands by to receive further audio until the activation wait time ends).
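The threshold-gated behavior of the combination, in which a follow-up command whose semantic similarity meets the preset value is executed with the device remaining in the wake-up state, while a command falling below the value is ignored with the device returning to the standby state, may be sketched as follows; the state names and threshold value are hypothetical:

```python
STANDBY, WAKE = "standby", "wake-up"
THRESHOLD = 0.5  # hypothetical stand-in for the preset value (He, ¶119)

def handle_follow_up(similarity, state):
    # Returns (new_state, performed). A follow-up meeting the threshold
    # is performed and the device stays awake; otherwise the command is
    # ignored (Kim, ¶189) and the device drops to standby.
    if state != WAKE:
        return state, False  # no follow-up processing outside wake-up state
    if similarity >= THRESHOLD:
        return WAKE, True    # perform second operation, remain awake
    return STANDBY, False    # ignore command, return to standby
```

For example, `handle_follow_up(0.8, WAKE)` performs the operation and keeps the wake-up state, while `handle_follow_up(0.2, WAKE)` transitions to standby without performing it.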
Claims 6, 8, and 16-17 are rejected under 35 U.S.C. 103 as being unpatentable over Kim et al. (US 2020/0051554 A1) and Xu et al. (US 2021/0200813 A1) as applied to claims 1 and 11 above, and further in view of Lloyd et al. (US 8,521,526 B1).
Regarding Claims 6 and 16, Kim discloses wherein the user context information comprises at least one of a query history of a user (Kim, ¶198, analyzing a context based on previously input command), a query response history of the user (Kim, ¶¶204-205, determine an association of task by analyzing a previous conversational context based on a previously input command and additional conversational context on the basis of a pre-set domain list, the previous conversational context includes a task executed according to a previous command), a location of the user, a current time, an ambient temperature, a current state of the electronic apparatus, or a use history of the electronic apparatus by the user, and
wherein the one or more processors are further configured to execute the at least one instruction to:
identify a domain of the first user query (Kim ¶276, record user’s utterance, convert it into a sentence or a word, and determine whether a pre-set domain list (e.g., a contextual command for each domain) has a sentence corresponding to the converted sentence),
identify the predetermined query based on the domain (Kim ¶276, record user’s utterance, convert it into a sentence or a word, and determine whether a pre-set domain list (e.g., a contextual command for each domain) has a sentence corresponding to the converted sentence).
The combination of Kim as modified by Xu does not teach allocating a predetermined weight corresponding to the domain to the user context information.
Lloyd discloses an electronic device to disambiguate spoken queries (col. 4, lines 15-26 and see Fig. 1) by determining user context information (col. 6, lines 21-34, context data 215 associated with audio signal 214 comprises the time and date when the query term was received, the type of client device 201, and client device 201 states such as docked or holstered when the query term was received), identify a domain of a first user query (col. 11, lines 27-36, perform speech recognition on the audio signal to select two or more candidate transcriptions and generate one or more n-grams from each candidate transcription; col. 12, lines 33-35, determine a particular category that is associated with the n-gram),
identify a predetermined query based on the domain (col. 11, lines 39-40 and col. 12, lines 30-41, for each n-gram, determining a frequency with which the n-gram occurs in the past search queries by determining a quantity of past search queries that are associated with the particular category associated with the n-gram),
allocate a predetermined weight corresponding to the domain to the user context information (col. 12, lines 20-23 and col. 16, lines 18-20, determine a weighting value for the frequency with which the n-gram occurs in the past search queries, where more weight is given to queries that are from a context and device that match the user's current context and device when the spoken query term 212 is spoken; per col. 24, lines 13-17, the frequency with which each n-gram occurs in the past search queries was determined based on category match), and
obtain the query list comprising the at least one query based on the predetermined query, the user context information, and the predetermined weight allocated to the user context information (col. 12, lines 25-29 and see Fig. 2, determine a subset of weighted past search queries (Fig. 2, 220a-220i) that includes contexts which are similar to the context of the spoken query term).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to allocate a predetermined weight corresponding to the domain to the user context information and to obtain the query list comprising the at least one query based on the predetermined query, the user context information, and the predetermined weight allocated to the user context information, in order to use a user's recent search history to improve voice search accuracy (Lloyd, col. 11, lines 19-21).
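The context weighting attributed to Lloyd (col. 12 and col. 16), in which past search queries associated with the n-gram's category are counted and additional weight is given to queries whose recorded context matches the user's current context, may be sketched as follows; the weight factor and data layout are hypothetical:

```python
def weighted_frequency(past_queries, current_context, category, weight=2.0):
    # Count past queries in the n-gram's category, giving extra weight
    # (a hypothetical factor) to queries whose recorded context matches
    # the user's current context.
    total = 0.0
    for q in past_queries:
        if q["category"] != category:
            continue
        total += weight if q["context"] == current_context else 1.0
    return total

# Hypothetical past-query records with category and context fields.
past = [
    {"category": "music", "context": "docked"},
    {"category": "music", "context": "holstered"},
    {"category": "news", "context": "docked"},
]
score = weighted_frequency(past, "docked", "music")
```

The two music-category queries contribute, with the context-matching one counted double, so the weighted frequency here is 3.0.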
Regarding Claims 8 and 17, Kim as modified by Xu and Lloyd discloses wherein the one or more processors are further configured to execute the at least one instruction to:
identify a domain of the second user query (Kim, ¶276, record user utterance (second voice information per ¶276) and convert it into a sentence or a word to determine whether the pre-set domain list has a sentence corresponding to the converted sentence; compare Lloyd, col. 11, lines 39-40 and col. 12, lines 30-41),
based on the domain requiring user confirmation, control the speaker to output a voice requesting confirmation on whether the second operation corresponding to the second user query is to be performed (Kim, ¶308, device 400 may process audio output generated by device 400 through a speaker; see Lloyd, Fig. 2, “Did you mean: Gym New York?” for input query “Jim Noo Ork”),
based on content approving to perform the second operation (Lloyd, col. 16, lines 27-30, user may take an action indicating that he was satisfied with the accuracy of the speech recognition result), perform the second operation corresponding to the second user query (applying the established function of Kim, ¶276, compare and analyze a context of first voice information (previous dialog context) with context of second voice information (additional dialog context); ¶¶277-78, determine that the command (second voice information) corresponds to the same function as a function of a previously processed command (first voice information), process a corresponding command in association with a function performed depending on a previous command) and maintain the state of the electronic apparatus as the wake-up state (applying the established function of Kim, ¶¶242-43, check activation wait time set for recognition of follow-up command or after a task is performed following wake-up and continue to receive audio input corresponding to user’s utterance during activation wait time), the content being included in third voice data obtained based on a fourth voice sensed through the microphone (in view of Lloyd, col. 23, lines 35-38, user may speak search query term; indicate satisfaction by speaking to choose the correct query terms), and
based on the content not approving to perform the second operation, change the state of the electronic apparatus from the wake-up state to the standby state and prevent performing the second operation corresponding to the second user query (Lloyd, col. 23, lines 35-38, if the candidate transcriptions are not acceptable, the user speaks a new search query term to restart the process; i.e., the established base device of Kim returns to the standby state to accept new queries without executing operations corresponding to the unacceptable query processing results).
Claim 7 is rejected under 35 U.S.C. 103 as being unpatentable over Kim et al. (US 2020/0051554 A1) in view of Xu et al. (US 2021/0200813 A1) as applied to claim 1 above, and further in view of Park et al. (US 11,955,123 B2).
Regarding Claim 7, Kim does not disclose wherein the one or more processors are further configured to execute the at least one instruction to: exclude, from the query list, an unperformable query corresponding to an operation that is not currently performable by the electronic apparatus.
Park teaches excluding, from a query list, an unperformable query corresponding to an operation that is not currently performable by an electronic apparatus (col. 8, lines 1-13).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to exclude, from a query list, an unperformable query corresponding to an operation that is not currently performable by the electronic apparatus, in order to identify a corresponding function / query that is not performable (Park, col. 8, lines 18-21).
Claims 10 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Kim et al. (US 2020/0051554 A1) and Xu et al. (US 2021/0200813 A1) as applied to claims 1 and 11 above, and further in view of Baeuml et al. (US 2023/0074406 A1).
Regarding Claims 10 and 19, Kim does not teach wherein the one or more processors are further configured to execute the at least one instruction to: obtain the query list comprising the at least one query based on a first vector value output by a query prediction model based on the predetermined query and the user context information.
Baeuml discloses an electronic apparatus (¶41 and Fig. 1, client device 110) identifying a first user query included in first voice data (¶33, capture audio data corresponding to user spoken utterances; ¶42, using NLU engine 140A1 to process a stream of ASR outputs of the spoken utterances to generate a stream of NLU output), perform a first operation corresponding to the first user query (¶42, process the NLU output to generate a stream of fulfillment data that corresponds to a set of assistant outputs that are predicted to be responsive to an assistant query included in the captured spoken utterance), identify a predetermined query corresponding to the first user query (¶52, LLM engine 150A1 determines that the assistant query included in the spoken utterance matches corresponding assistant query and context 202 of the dialog session in which the assistant query is received), obtain a query list comprising at least one query based on the predetermined query and user context information (¶52, LLM engine 150A1 obtains one or more LLM outputs indexed by the corresponding assistant query and corresponding context that matches the assistant query; ¶65, index the one or more LLM outputs based on the given assistant query and/or corresponding context of the corresponding prior dialog session for the given assistant query in memory / LLM output database 150A; per ¶52, LLM output database 150A stores LLM outputs indexed to corresponding assistant queries and corresponding context of a corresponding dialog session) wherein obtain the query list comprising the at least one query based on a first vector value output by a query prediction model based on the predetermined query and the user context information (¶52, LLM engine 150A1 obtains one or more LLM outputs indexed by the corresponding assistant query and corresponding context that matches the assistant query (per ¶10, generate corresponding embedding for the assistant queries and map each of the corresponding embeddings to an
embedding space to index the one or more LLM outputs); ¶65, index the one or more LLM output based on the given assistant query and/or corresponding context of the corresponding prior dialog session for the given assistant query in memory / LLM output database 150A; per ¶52, LLM output database 150A stores LLM outputs indexed to corresponding assistant queries and corresponding context of a corresponding dialog session), wherein the query prediction model is trained based on a second vector value generated by the query prediction model based on providing a third user query and the user context information to the query prediction model (¶66, using the LLM outputs to modify or re-train the LLM).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to obtain a query list comprising at least one query based on the predetermined query and the user context information by obtaining the query list comprising the at least one query based on a first vector value output by a query prediction model based on the predetermined query and the user context information, in order to leverage the current query and user context information to determine outputs responsive to the current query (Baeuml, ¶72).
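The embedding-indexed lookup attributed to Baeuml (¶10, ¶52), in which cached LLM outputs are indexed by embeddings of prior assistant queries and retrieved by proximity of the current query's embedding, may be sketched as follows; the toy character-count embedding is a hypothetical stand-in for a learned encoder, and the index contents are invented for illustration:

```python
import math

def embed(text):
    # Hypothetical stand-in for an embedding model: a tiny bag-of-letters
    # vector. A real system would use a learned encoder.
    vec = [0.0] * 26
    for ch in text.lower():
        if ch.isalpha():
            vec[ord(ch) - ord("a")] += 1.0
    return vec

def nearest(index, query_vec):
    # Return the cached output whose stored query embedding is closest
    # (Euclidean distance) to the current query embedding.
    def dist(v):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(v, query_vec)))
    key = min(index, key=lambda k: dist(index[k][0]))
    return index[key][1]

# Index prior assistant queries (embedding, cached LLM output).
index = {
    "play jazz": (embed("play jazz"), "playing jazz station"),
    "weather today": (embed("weather today"), "sunny, 20C"),
}
out = nearest(index, embed("play some jazz"))
```

The current query "play some jazz" maps closest to the stored embedding for "play jazz", so its cached output is retrieved.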
Allowable Subject Matter
Claims 5 and 15 are objected to as being dependent upon a rejected base claim, but would be allowable upon being rewritten in independent form including all of the limitations of the base claim and any intervening claims for the following reason:
The prior art of record does not disclose or render obvious:
wherein the one or more processors are further configured to execute the at least one instruction to:
identify a second semantic similarity between the second user query and the corresponding query,
based on the second semantic similarity being greater than or equal to the predetermined value, perform the second operation corresponding to the second user query and maintain the state of the electronic apparatus as the wake-up state, and
based on the second semantic similarity being less than the predetermined value, convert the state of the electronic apparatus from the wake-up state to the standby state and prevent performing the second operation corresponding to the second user query.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to examiner Richard Z. Zhu whose telephone number is 571-270-1587 or examiner's supervisor Hai Phan whose telephone number is 571-272-6338. Examiner Richard Zhu can normally be reached on M-Th, 0730-1700.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/RICHARD Z ZHU/Primary Examiner, Art Unit 2654 01/27/2026