Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first-inventor-to-file provisions of the AIA.
DETAILED ACTION
Claims 1-20 are pending. Claims 1, 6, 11, and 16 are independent.
This Application was published as US 20240256789.
Apparent priority is 15 October 2021.
The instant Application is directed to a method of responding to queries by determining the dialog type and inputting the dialog type along with the dialog into a response generation network.
Claim Objections
Claim 5 is objected to because of the following informalities: in line 4, "at last one" is understood to mean "at least one". Appropriate correction is required.
Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.
Claims 1-3, 5-8, 10-13, 15-18, and 20 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Liang (US 20220083742 A1).
Regarding claim 1, Liang discloses: 1. A response determining method, wherein the method comprises: obtaining a to-be-responded first user statement; ("[0083] The obtaining module is configured to obtain a current dialogue sentence input by a user." )
determining first state information of the first user statement based on the first user statement by using a state determining network, ("[0086] The first neural network module is configured to: use the current dialogue sentence and a goal type and a goal entity of a preceding dialogue sentence obtained before the current dialogue sentence as an input; generate a goal type and a goal entity of the current dialogue sentence through feature extraction." )
wherein the first state information comprises a first dialog type of the first user statement, and the first dialog type is a chit-chat dialog, a task- oriented dialog, a question answering dialog, or a retrieval dialog; and ("[0084] The neural network system is configured to: use the current dialogue sentence and knowledge base data as an input, and generate a reply sentence through feature extraction, where the reply sentence is a chitchat sentence, an answer sentence or a recommendation sentence." )
inputting the first user statement and the first dialog type into a response generation network, to obtain a response corresponding to the first user statement. ("[0087] The second neural network module is configured to: use the current dialogue sentence, the goal type and the goal entity of the current dialogue sentence and the knowledge base data as an input; and generate the reply sentence through feature extraction and classification." )
Regarding claim 2, Liang discloses: 2. The method according to claim 1, wherein the determining first state information of the first user statement based on the first user statement by using a state determining network comprises: determining the first dialog type of the first user statement from a plurality of dialog types by using a state determining network, wherein the plurality of dialog types comprise at least two of the chit-chat dialog, the task-oriented dialog, the question answering dialog, or the retrieval dialog. ("[0084] The neural network system is configured to: use the current dialogue sentence and knowledge base data as an input, and generate a reply sentence through feature extraction, where the reply sentence is a chitchat sentence, an answer sentence or a recommendation sentence." )
Regarding claim 3, Liang discloses: 3. The method according to claim 1, wherein the method further comprises: obtaining a to-be-responded second user statement; determining second state information of the second user statement based on the second user statement by using the state determining network, wherein the second state information comprises a second dialog type of the second user statement, the second dialog type is a chit-chat dialog, a task-oriented dialog, a question answering dialog, or a retrieval dialog, and the second dialog type is different from the first dialog type; and inputting the second user statement and the second dialog type into the response generation network, to obtain a response corresponding to the second user statement. ("[0081] In summary, the man-machine dialogue method provided by the example may fuse various types of dialogues by the neural network, so as to actively and naturally guide the man-machine dialogues from non-recommendation dialogues such as chitchat dialogues, question answering dialogues, task-based dialogues to recommendation dialogues, and fuse the knowledge base data naturally into the dialogue; and the technical solution may accurately generate and output to the user the reply sentences including chitchat sentences, answering sentences and recommendation sentences by the neural network, and then realize the recommendation goal through one or more dialogue interactions in time and accurately on the basis of the knowledge base data and the user interest obtained by analyzing the dialogue sentences input by the user, which may enhance the initiative, scalability and richness of the man-machine dialogue, thereby enhancing the user experience." – Liang discloses that the system can output a plurality of replies to a plurality of types of queries.)
Regarding claim 5, Liang discloses: 5. The method according to claim 1, wherein the inputting the first user statement and the first dialog type into a response generation network, to obtain a response corresponding to the first user statement comprises: obtaining, from at last one of the first user statement or a database based on the first user statement, a keyword or a key sentence for constructing the response; and ("[0086] The first neural network module is configured to: use the current dialogue sentence and a goal type and a goal entity of a preceding dialogue sentence obtained before the current dialogue sentence as an input; generate a goal type and a goal entity of the current dialogue sentence through feature extraction." )
inputting the first user statement, the first dialog type, and the keyword or the key sentence into the response generation network, to obtain the response corresponding to the first user statement. ("[0087] The second neural network module is configured to: use the current dialogue sentence, the goal type and the goal entity of the current dialogue sentence and the knowledge base data as an input; and generate the reply sentence through feature extraction and classification." )
Regarding claim 6, Liang discloses: 6. A response determining method, wherein the method comprises: obtaining a first user statement, a first dialog type of the first user statement, and a first response corresponding to the first user statement, wherein the first dialog type is a real type of the first user statement, and the first dialog type is a chit-chat dialog, a task-oriented dialog, a question answering dialog, or a retrieval dialog; determining first state information of the first user statement based on the first user statement by using a state determining network, wherein the first state information comprises a second dialog type of the first user statement; inputting the first user statement and the first dialog type into a response generation network, to obtain a second response corresponding to the first user statement; updating the state determining network based on a difference between the first dialog type and the second dialog type; and updating the response generation network based on a difference between the first response and the second response. (See mapping of claim 1. Liang further discloses training of the networks: "[0026] The training stage is used to determine, according to the training data, parameters of each network or model in the neural network system by maximizing a likelihood function on the training data through a back-propagation algorithm or a stochastic gradient descent algorithm. The usage stage is used to generate the reply sentence and return it to the user by having the current dialogue sentence input by the user as the input of the neural network system and performing calculations by the neural network system based on a knowledge base which has already been constructed." )
Regarding claim 7, Liang discloses: 7. The method according to claim 6, wherein the determining first state information of the first user statement based on the first user statement by using a state determining network comprises: determining the second dialog type of the first user statement from a plurality of dialog types by using the state determining network, wherein the plurality of dialog types comprise at least two of the chit-chat dialog, the task-oriented dialog, the question answering dialog, or the retrieval dialog. ("[0084] The neural network system is configured to: use the current dialogue sentence and knowledge base data as an input, and generate a reply sentence through feature extraction, where the reply sentence is a chitchat sentence, an answer sentence or a recommendation sentence." )
Regarding claim 8, Liang discloses: 8. The method according to claim 6, wherein the method further comprises: obtaining a second user statement, a third dialog type of the second user statement, and a third response corresponding to the second user statement, wherein the third dialog type is a real type of the second user statement; determining second state information of the second user statement based on the second user statement by using the state determining network, wherein the second state information comprises a fourth dialog type of the second user statement, and the fourth dialog type is different from the third dialog type; inputting the second user statement and the third dialog type into the response generation network, to obtain a fourth response corresponding to the second user statement; updating the state determining network based on a difference between the fourth dialog type and the third dialog type; and updating the response generation network based on a difference between the fourth response and the third response. (See claim 3. Liang further discloses multiple training samples: "[0033] The knowledge base is generated on the basis of some “facts”. The knowledge base includes records or “tuples”. Specifically, the knowledge base may be obtained from the Internet. In this example, the knowledge base data may be specifically composed of multiple triples. Web pages may be crawled from encyclopedia knowledge sites such as Baidu Encyclopedia, Interactive Encyclopedia, and Douban, and the structured triple may be obtained by analyzing a table in the web page. After further processing including denoising, merging and so on, multiple triples are extracted to form the knowledge base." - see also "[0024] obtaining the neural network system by performing training on training data, where the training data includes dialogue sequences, candidate reply sentences, the knowledge base data, and a target recommendation sequence, and the dialogue sequences include a chitchat dialogue, a question answering dialogue, and a recommendation dialogue." )
Regarding claim 10, Liang discloses: 10. The method according to claim 6, wherein the inputting the first user statement and the first dialog type into a response generation network, to obtain a second response corresponding to the first user statement comprises: obtaining, from the first user statement or a database based on the first user statement, a keyword or a key sentence for constructing the response; and inputting the first user statement, the first dialog type, and the keyword or the key sentence into the response generation network, to obtain the second response corresponding to the first user statement. ("[0086] The first neural network module is configured to: use the current dialogue sentence and a goal type and a goal entity of a preceding dialogue sentence obtained before the current dialogue sentence as an input; generate a goal type and a goal entity of the current dialogue sentence through feature extraction. [0087] The second neural network module is configured to: use the current dialogue sentence, the goal type and the goal entity of the current dialogue sentence and the knowledge base data as an input; and generate the reply sentence through feature extraction and classification." )
Claim 11 is an apparatus claim with limitations corresponding to the limitations of Claim 1 and is rejected under similar rationale. Additionally, the "at least one processor" and "one or more memories" of the claim are taught by Liang ("CPU"; "RAM"; "ROM"; Fig. 5).
Claim 12 is an apparatus claim with limitations corresponding to the limitations of Claim 2 and is rejected under similar rationale.
Claim 13 is an apparatus claim with limitations corresponding to the limitations of Claim 3 and is rejected under similar rationale.
Claim 15 is an apparatus claim with limitations corresponding to the limitations of Claim 5 and is rejected under similar rationale.
Claim 16 is an apparatus claim with limitations corresponding to the limitations of Claim 6 and is rejected under similar rationale. Additionally, the "at least one processor" and "one or more memories" of the claim are taught by Liang ("CPU"; "RAM"; "ROM"; Fig. 5).
Claim 17 is an apparatus claim with limitations corresponding to the limitations of Claim 7 and is rejected under similar rationale.
Claim 18 is an apparatus claim with limitations corresponding to the limitations of Claim 8 and is rejected under similar rationale.
Claim 20 is an apparatus claim with limitations corresponding to the limitations of Claim 10 and is rejected under similar rationale.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 4, 9, 14, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Liang in view of Dymetman et al. (US 20220108081 A1).
Regarding claim 4, Liang does not disclose the additional limitations.
Dymetman discloses: 4. The method according to claim 1, wherein the state determining network and the response generation network each are a generative pre-trained transformer (GPT) model, a dialogue generative pre-trained transformer (DialoGPT) model, a bidirectional and auto-regressive transformer (BART) model, or a transfer text-to-text transformer (T5) model. ("[0006] Language models (LMs), in a strict sense, such as GPT-2 and GPT-3, or in an extended sense, such as BERT, pre-trained on large datasets of text, are well known in natural language processing. Such pre-trained language models are typically seen as stores of generic knowledge that can be used for downstream tasks through fine-tuning, often done by providing a small amount of task-specific training data and extending or modifying parameters of pre-trained language models..." )
Liang and Dymetman are considered analogous art to the claimed invention because both disclose methods of text generation. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the method of Liang to use a GPT model for the neural networks, as disclosed by Dymetman. Doing so would have been beneficial because pre-trained language models serve as stores of generic knowledge usable for downstream tasks (Dymetman, [0006]). This combination falls under combining prior art elements according to known methods to yield predictable results, or simple substitution of one known element for another to obtain predictable results. See MPEP 2141; KSR, 550 U.S. at 418, 82 USPQ2d at 1396.
Regarding claim 9, Liang does not disclose the additional limitations.
Dymetman discloses: 9. The method according to claim 6, wherein the state determining network and the response generation network each are a generative pre-trained transformer (GPT) model, a dialogue generative pre-trained transformer (DialoGPT) model, a bidirectional and auto-regressive transformer (BART) model, or a transfer text-to-text transformer (T5) model. ("[0006] Language models (LMs), in a strict sense, such as GPT-2 and GPT-3, or in an extended sense, such as BERT, pre-trained on large datasets of text, are well known in natural language processing. Such pre-trained language models are typically seen as stores of generic knowledge that can be used for downstream tasks through fine-tuning, often done by providing a small amount of task-specific training data and extending or modifying parameters of pre-trained language models..." )
See claim 4 for motivation statement.
Claim 14 is an apparatus claim with limitations corresponding to the limitations of Claim 4 and is rejected under similar rationale.
Claim 19 is an apparatus claim with limitations corresponding to the limitations of Claim 9 and is rejected under similar rationale.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Wang et al. (US 20220414737 A1). Wang discloses that GPT, DialoGPT, BART, and T5 models are known machine learning models for processing queries. (See [0038])
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JON C MEIS whose telephone number is (703)756-1566. The examiner can normally be reached Monday - Thursday, 8:30 am - 5:30 pm EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Hai Phan, can be reached at 571-272-6338. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/JON CHRISTOPHER MEIS/Examiner, Art Unit 2654
/HAI PHAN/Supervisory Patent Examiner, Art Unit 2654