Last updated: April 19, 2026

Application No. 18/725,725

KEYWORD DETECTION METHOD AND APPARATUS, AND ELECTRONIC DEVICE AND STORAGE MEDIUM

Non-Final OA §101 §103

Filed: Jun 28, 2024

Examiner: HANG, VU B

Art Unit: 2654

Tech Center: 2600 (Communications)

Assignee: BEIJING ZITIAO NETWORK TECHNOLOGY CO., LTD.

OA Round: 1 (Non-Final)

Interview Optional

+17.5% interview lift. This examiner has a relatively high allow rate; a written response may suffice.

Based on 619 resolved cases, 2023–2026

Examiner Intelligence

HANG, VU B

Grants 74% — above average

Career Allow Rate

461 granted / 619 resolved

+12.5% vs TC avg

Strong +18% interview lift

Interview Lift: +17.5% (resolved cases with an interview versus without)

Typical timeline

2y 11m

Avg Prosecution

17 currently pending

Career history

636

Total Applications

across all art units

Statute-Specific Performance

§101: 15.2% (-24.8% vs TC avg)

§103: 52.8% (+12.8% vs TC avg)

§102: 18.7% (-21.3% vs TC avg)

§112: 5.4% (-34.6% vs TC avg)

Tech Center average is an estimate. Based on career data from 619 resolved cases.

Office Action

§101 §103

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claims 1-6, 9-10 and 12-21 are pending.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-6, 9-10 and 12-21 are directed to an abstract idea without significantly more.
Regarding Claims 1, 9 and 10, the claims recite a method, device and non-transitory computer-readable medium for detecting a keyword, comprising: determining, for a target audio clip in a target audio, a first probability that a target audio frame in the target audio clip corresponds to a target character unit, wherein the first probability indicates a probability that the target audio frame is a voice frame of the target character unit, the target character unit is a character unit comprised in a preset keyword, and a position of the target audio frame in the target audio clip corresponds to a position of the target character unit in the preset keyword; determining, based on the first probability, a second probability that the target audio clip corresponds to the preset keyword, the second probability indicating a probability that respective audio frames in the target audio clip are sequentially respective character units in the preset keyword; and determining, based on the second probability, whether the target audio clip is a voice clip of the preset keyword.
The limitations of the claims, as drafted, recite a process that, under its broadest reasonable interpretation, covers performance of the limitations in the mind but for the recitation of generic computer components. That is, other than reciting a “processor”, nothing in the claim elements precludes the steps from practically being performed in the mind. Each of the limitations in the claims can be performed in the human mind, including by observation, evaluation and judgment. For example, the limitation of “determining, for a target audio clip in a target audio, a first probability that a target audio frame in the target audio clip corresponds to a target character unit” can be completed by a person listening to an audio clip and calculating a probability that the clip includes a target character. The limitation that “a position of the target audio frame in the target audio clip corresponds to a position of the target character unit in the preset keyword” can be completed by a person listening to the audio clip, writing down the sequence of characters of the words from the clip, and determining which of the characters of the clip are key characters, whether there are consecutive characters in the clip that are key characters, and where those characters appear in the sequence of characters. The limitation of “determining, based on the first probability, a second probability that the target audio clip corresponds to the preset keyword, the second probability indicating a probability that respective audio frames in the target audio clip are sequentially respective character units in the preset keyword” can be completed by observing on paper the sequence of characters determined earlier and determining whether a number of consecutive characters are key characters of a keyword.
The limitation of “determining, based on the second probability, whether the target audio clip is a voice clip of the preset keyword” can be completed by judging, from the sequence of characters determined on paper, whether a number of key characters of the keyword are present, and determining that the keyword is in the audio clip.
This judicial exception is not integrated into a practical application. In particular, the claims only recite one additional element - using a processor to perform the processing steps. The processor is recited at a high level of generality such that it amounts to no more than mere instructions to apply the exception using a generic computer component. Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. Therefore, the claims are directed to an abstract idea.
The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the integration of the abstract idea into a practical application, the additional element of using a processor to perform the processing steps amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. Therefore, the claims are not patent eligible.
Regarding Claims 2-6 and 12-21, the rationale provided for Claims 1, 9 and 10 is incorporated herein.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-3, 5-6, 9-12, 16-17, 20 and 21 are rejected under 35 U.S.C. 103 as being unpatentable over Gao et al. (US Pub. 2020/0357386 A1) in view of Washio et al. (US Pub. 2011/0218805 A1).
Regarding Claims 1, 9 and 10, Gao teaches a keyword detection method (see Fig.1 and paragraph [0025]), comprising:
determining, for a target audio clip in a target audio, a first probability that a target audio frame in the target audio clip corresponds to a target character unit (see Fig.2 (201,202,203) and paragraphs [0043-0045], calculating the posterior probability for each character and frame), wherein the first probability indicates a probability that the target audio frame is a voice frame of the target character unit (see Fig.2 (201,202,203) and paragraphs [0043-0045]), and the target character unit is a character unit comprised in a preset keyword (see Fig.2 (203) and paragraphs [0045-0046]);
determining, based on the first probability, a second probability that the target audio clip corresponds to the preset keyword (see Fig.2 (204), paragraphs [0050-0054] and paragraph [0058], determining confidence of the presence of a number of N key characters), the second probability indicating a probability that respective audio frames in the target audio clip are sequentially respective character units in the preset keyword (see Fig.2 (204), paragraphs [0050-0054] and paragraph [0058], determining confidence of the presence of a number of N consecutive key characters);
and determining, based on the second probability, whether the target audio clip is a voice clip of the preset keyword (see Fig.2 (204,205) and paragraph [0065]).
Gao fails to teach that a position of the target audio frame in the target audio clip corresponds to a position of the target character unit in the preset keyword.
Washio, however, teaches determining the position and score of each audio frame in a speech sample containing a keyword (see Fig.3, Fig.4, paragraph [0036] and paragraph [0046], pointers indicating the positions of the audio frames).
It would have been obvious to one skilled in the art, before the effective filing date of the application, to include in Gao’s method the step of determining that a position of the target audio frame in the target audio clip corresponds to a position of the target character unit in the preset keyword. The motivation would be to identify the portion of the audio clip, or the consecutive frames, that contains at least two key characters of a particular keyword.
Regarding Claims 2, 12 and 17, Gao further teaches determining an audio feature of the target audio frame (see Fig.2 (202) and paragraphs [0034-0035], extracting the eigenvector of the character frame); and inputting the audio feature into a trained neural network model to obtain the first probability that the target audio frame corresponds to the target character unit (see Fig.3 and paragraphs [0050-0053], inputting the eigenvectors into the neural network model for the character determination).
Regarding Claims 5, 15 and 20, Gao further teaches, if the second probability is greater than a preset threshold, determining that the target audio clip is a voice segment of the preset keyword (see Fig.2 (205) and paragraphs [0065-0066]).
Regarding Claims 6, 16 and 21, Gao further teaches wherein the character unit comprises a Chinese character (see Fig.3, paragraph [0020] and paragraph [0023]).

Claim Objections
Claims 3-4, 13-14 and 18-19 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
Regarding Claims 3, 13 and 18, the following is a statement of reasons for the indication of allowable subject matter: The prior art of record does not teach, disclose or suggest the claimed limitation of (in combination with all other limitations in the claim) “wherein, the determining, based on the first probability, a second probability that the target audio clip corresponds to the preset keyword, comprises: determining a first probability that a target audio frame corresponds to a last target character unit in the preset keyword; determining a maximum value of confidences of a second-from-bottom target character unit in the preset keyword appearing in an audio frame before the target audio frame in the target audio clip; and determining the sum of the maximum value and the first probability that the target audio frame corresponds to the last target character unit in the preset keyword as the second probability; wherein the target audio frame is any audio frame in the target audio clip”. Similar features are claimed in Claims 4, 14 and 19.
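
The limitation quoted above can be read as a running-maximum recurrence over the frames of the clip. The following is a minimal sketch of one such reading, not the applicant's implementation. The quoted text only defines the rule for the last and second-from-last character units; the sketch assumes the same rule is applied recursively to earlier units and that the confidence of the first unit is simply its first probability. The function names, the `frame_probs` layout and the threshold value are hypothetical.

```python
from typing import List, Sequence

NEG_INF = float("-inf")


def keyword_scores(frame_probs: Sequence[Sequence[float]]) -> List[float]:
    """Return the second probability for every candidate target frame.

    frame_probs[t][k] is the first probability that frame t is a voice
    frame of the k-th character unit of the preset keyword (assumed layout).
    """
    num_units = len(frame_probs[0])
    # best_prev[k]: maximum confidence of unit k over frames strictly
    # before the current frame (-inf while no earlier frame exists).
    best_prev = [NEG_INF] * num_units
    scores: List[float] = []
    for probs in frame_probs:
        conf = [NEG_INF] * num_units
        conf[0] = probs[0]  # assumption: first unit has no predecessor
        for k in range(1, num_units):
            # maximum confidence of the previous unit in an earlier frame,
            # plus the first probability of this unit at the current frame
            conf[k] = best_prev[k - 1] + probs[k]
        scores.append(conf[-1])  # score with this frame as the target frame
        best_prev = [max(b, c) for b, c in zip(best_prev, conf)]
    return scores


def is_keyword(frame_probs: Sequence[Sequence[float]], threshold: float) -> bool:
    """Threshold decision in the style of Claims 5, 15 and 20 (threshold is hypothetical)."""
    return max(keyword_scores(frame_probs)) > threshold


if __name__ == "__main__":
    # Two character units, three frames; the values are made up for illustration.
    clip = [[0.75, 0.125], [0.25, 0.5], [0.125, 0.25]]
    print(keyword_scores(clip))
    print(is_keyword(clip, threshold=1.0))
```

Because the claimed score is a sum rather than a product, it can exceed 1 when raw probabilities are used, so any threshold would have to be chosen on the same scale; whether the application works with raw or log probabilities is not stated in the text above.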

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to VU B HANG whose telephone number is (571)272-0582.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Hai Phan, can be reached at (571)272-6338. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/VU B HANG/Primary Examiner, Art Unit 2654

Prosecution Timeline

Jun 28, 2024

Application Filed

Jan 10, 2026

Non-Final Rejection — §101, §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

18/624,615

Patent 12603082

METHOD FOR TRAINING VOICE CONVERSION MODEL, ELECTRONIC DEVICE, AND STORAGE MEDIUM

2y 5m to grant Granted Apr 14, 2026

17/547,917

Patent 12592237

DRIVER INTERFACE WITH VOICE AND IMAGE CONTROL

2y 5m to grant Granted Mar 31, 2026

18/204,585

Patent 12555566

METHOD FOR TRAINING MACHINE READING COMPREHENSION MODEL, COMPUTER-READABLE RECORDING MEDIUM AND QUESTION ANSWERING SYSTEM

2y 5m to grant Granted Feb 17, 2026

18/236,302

Patent 12548562

SPEAKER DIARIZATION USING SPEAKER EMBEDDING(S) AND TRAINED GENERATIVE MODEL

2y 5m to grant Granted Feb 10, 2026

18/690,377

Patent 12542128

LEARNING APPARATUS, CONVERSION APPARATUS, LEARNING METHOD AND PROGRAM

2y 5m to grant Granted Feb 03, 2026

Study what changed to get past this examiner. Based on 5 most recent grants.

Prosecution Projections

1-2

Expected OA Rounds

74%

Grant Probability

92%

With Interview (+17.5%)

2y 11m

Median Time to Grant

Low

PTA Risk

Based on 619 resolved cases by this examiner. Grant probability derived from career allow rate.
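
The projection figures can be cross-checked against the examiner counts reported earlier on this page. The sketch below assumes the interview lift is added to the career allow rate in percentage points, which is how the "92% With Interview (+17.5%)" line appears to be derived; the page does not state its method, and the function names are hypothetical.

```python
def allow_rate_pct(granted: int, resolved: int) -> float:
    """Career allow rate as a percentage of resolved cases."""
    return 100.0 * granted / resolved


def with_interview_pct(base_pct: float, lift_points: float) -> float:
    """Assumed additive lift in percentage points, capped at 100%."""
    return min(100.0, base_pct + lift_points)


if __name__ == "__main__":
    granted, resolved, pending = 461, 619, 17  # counts reported on this page
    base = allow_rate_pct(granted, resolved)
    print(f"Career allow rate: {base:.1f}%")
    print(f"With interview:    {with_interview_pct(base, 17.5):.1f}%")
    print(f"Total applications: {resolved + pending}")
```

Under that assumption the counts are consistent with the displayed 74% and 92% after rounding, and resolved plus pending cases match the 636 total applications.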