Prosecution Insights
Last updated: April 19, 2026
Application No. 18/562,356

VOICE CONTROL METHOD AND APPARATUS, COMPUTER READABLE STORAGE MEDIUM, AND ELECTRONIC DEVICE

Non-Final OA (§102, §103)
Filed
Nov 20, 2023
Examiner
CHAVEZ, RODRIGO A
Art Unit
2658
Tech Center
2600 — Communications
Assignee
BOE TECHNOLOGY GROUP CO., LTD.
OA Round
1 (Non-Final)
Grant Probability
50% (Moderate)
OA Rounds
1-2
To Grant
3y 5m
With Interview
88%

Examiner Intelligence

Career Allow Rate
50% (grants 50% of resolved cases: 115 granted / 228 resolved; -11.6% vs TC avg)
Interview Lift
+37.3% (strong lift for resolved cases with an interview vs. without)
Avg Prosecution
3y 5m (typical timeline; 22 currently pending)
Total Applications
250 (career history, across all art units)

Statute-Specific Performance

§101: 16.4% (-23.6% vs TC avg)
§103: 53.1% (+13.1% vs TC avg)
§102: 20.9% (-19.1% vs TC avg)
§112: 5.6% (-34.4% vs TC avg)
Comparisons are against Tech Center average estimates • Based on career data from 228 resolved cases

Office Action

§102, §103
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Information Disclosure Statement

The information disclosure statement (IDS) submitted on 05/14/2024 was filed. The submission is in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statement is being considered by the examiner.

Claim Rejections - 35 USC § 102

The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:

A person shall be entitled to a patent unless – (a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.

Claims 1, 10-13, 15, 16 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Jang (US PG Pub 20120316876).

As per claims 1, 15 and 16, Jang discloses: A voice control method, applied in a display terminal, an electronic device and a computer non-transitory readable storage medium on which a computer program is stored (Jang; p. 0182 - the present invention may be recorded in a computer-readable recording medium as a program to be executed in the computer), wherein when the computer program is executed by a processor (Jang; p. 0182 - The program or the code segments may be stored in a processor-readable medium; see also p. 0073-0074 & Fig. 2, item 180), and a memory (Jang; Fig. 2, item 160; p. 0074 - The software codes may be stored in the memory 160), configured to store executable instructions for the processor, the voice control method is implemented, comprising: obtaining user voice information (Jang; Fig. 3, item 17; p. 0080 - The voice input unit 17 receives voice signals of a speaker. For example, the voice input unit 17 can correspond to a microphone), and creating a voice control relationship between the user and a target voice control window based on the user voice information (Jang; Fig. 6, item SI; Fig. 7, item SI; p. 0087-0092 - The display device 100 can display the voice recognition result on the display unit 151 by using an indicator related to at least one of the speaker, the voice input device, and the voice recognition result. The indicator related to the speaker is an indicator capable of identifying the speaker, which can include text, an image, a sound signal, a display setting value corresponding to a particular speaker, and a voice pattern of the particular speaker; see also p. 0105 - though the speaker says "CH 10", the display device 100 displays the name (John) of the speaker S in the form of text on the display unit 151 and thus, the speaker S can know that his or her voice is recognized by the display device 100; see also p. 0106 - …the indicator displayed on the display unit 151 is an avatar for identifying the speaker S; accordingly, the speaker S can know that his or her voice is recognized by the display device 100… The item SI in both Fig. 6 & 7 are display windows that are used to identify the speaker who is using voice control, thus they represent a relationship between the user and the target voice control window on the display; see also p. 0121), wherein the target voice control window is one of multiple voice control windows displayed in the display terminal (Jang; Fig. 11, items S1 & S2; p. 0120 - with reference to FIG. 11, the controller 180 can recognize a first speaker S1 and a second speaker S2 respectively and display a first speaker indicator (SI1, a first avatar) corresponding to the first speaker S1 and a second speaker indicator (SI2, a second avatar) corresponding to the second speaker S2 on the display unit 151; see also p. 0121; see also Fig. 9; see also Fig. 19 & p. 0148); and converting the user voice information into a control instruction, and executing control content corresponding to the control instruction in the target voice control window (Jang; p. 0153 - For example, as shown in FIG. 20, it can be known that the first speaker S1 has set up a "Music" program as his or her favorite or high priority program. Therefore, according to a control method for a display device according to an embodiment of the present invention, the display device 100 can receive a speaker's voice through a voice input device, carry out voice recognition upon the received voice, recognize a speaker corresponding to the voice recognition result, and control itself to operate in the environment set up by the speaker… receiving and converting instructions for controlling a “Music” program).

As per claim 10, Jang discloses: The voice control method according to claim 1, wherein the obtaining user voice information comprises: obtaining original user voice information, and decoding the original user voice information to obtain user voice audio (Jang; p. 0118 - The memory 160 can store a reference voice pattern of each speaker. The reference voice pattern can be obtained through a repetitive voice input procedure. More specifically, the controller 180 can extract a feature vector from a voice signal generated by a speaker; calculates a probability value between the extracted feature vector and at least one speaker model pre-stored in a database; and carry out speaker identification determining whether the speaker is the one registered in the database based on the calculated probability value or speaker verification determining whether the speaker's access has been made in a proper way); and performing text recognition on the user voice audio to obtain the user voice information (Jang; p. 0100 - if a speaker S says "CH 10" toward the microphone of the remote control 10, the controller 180 can recognize the voice of the speaker S and display the recognition result on the display unit 151 in the form of text of "CH 10").

As per claim 11, Jang discloses: The voice control method according to claim 1, wherein the control instruction comprises an execution action and execution content; and the executing the control content corresponding to the control instruction in the target voice control window, comprises: executing the execution content in the target voice control window based on the execution action (Jang; p. 0153 - For example, as shown in FIG. 20, it can be known that the first speaker S1 has set up a "Music" program as his or her favorite or high priority program. Therefore, according to a control method for a display device according to an embodiment of the present invention, the display device 100 can receive a speaker's voice through a voice input device, carry out voice recognition upon the received voice, recognize a speaker corresponding to the voice recognition result, and control itself to operate in the environment set up by the speaker… receiving and converting instructions for controlling a “Music” program).
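Read as a pipeline, independent claim 1 has four steps: obtain the user voice information, create a voice control relationship binding the user to a target window, convert the utterance into a control instruction, and execute it in that window. The minimal Python sketch below restates those steps for illustration only; the names are hypothetical, and a stub stands in for the decoding and text recognition of claim 10. It is not code from the application or from Jang.

```python
from dataclasses import dataclass


@dataclass
class VoiceControlWindow:
    window_id: str
    bound_user: str | None = None  # the "voice control relationship" of claim 1

    def execute(self, action: str, content: str) -> None:
        # claim 11: a control instruction = execution action + execution content
        print(f"[{self.window_id}] {action}: {content}")


def recognize_text(raw_audio: bytes) -> str:
    """Stub for decoding the audio and performing text recognition (claim 10)."""
    return raw_audio.decode("utf-8")  # a real system would run ASR here


def handle_utterance(raw_audio: bytes, user: str,
                     windows: list[VoiceControlWindow]) -> None:
    text = recognize_text(raw_audio)                  # obtain user voice information
    target = next((w for w in windows if w.bound_user == user),
                  windows[0])                         # reuse or pick a target window
    target.bound_user = user                          # create the control relationship
    action, _, content = text.partition(" ")          # convert into a control instruction
    target.execute(action, content)                   # execute in the target window


windows = [VoiceControlWindow("win-1"), VoiceControlWindow("win-2")]
handle_utterance(b"tune CH 10", user="John", windows=windows)  # -> [win-1] tune: CH 10
```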
As per claim 12, Jang discloses: The voice control method according to claim 1, further comprising: in a case that the user voice information corresponding to the user is not obtained within a preset time period, displaying default content in the target voice control window (Jang; p. 0086 - At this time, the display device 100 can receive voice from the at least one speaker simultaneously or sequentially with a predetermined time interval. For example, when two speakers generate voice at the same time, the display device 100 can display a voice recognition error message on the display unit 151. Also, when voice is received sequentially, the display device 100 carries out voice recognition according to the order of the corresponding input sequence; on the other hand, when another voice is input while voice recognition is carried out upon particular voice, a voice recognition error message can be displayed on the display unit 151).

As per claim 13, Jang discloses: The voice control method according to claim 1, wherein the user voice information comprises near-field voice information and/or far-field voice information (Jang; p. 0164 - the second voice input device is a microphone embedded in the display device 100 (e.g., a smart TV) or a microphone array prepared near the smart TV; and reveals weak mobility and usually located at a relatively long distance from a speaker (far-field information); p. 0171 - Accordingly, if a speaker uses the second voice input device, since signal strength of a voice signal is weak, the display device 100 can display an indicator 62 proposing re-inputting a voice by using the first voice input device on the display unit 151 (near-field information)).

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 2-9 and 17-21 are rejected under 35 U.S.C. 103 as being unpatentable over Jang in view of Skidmore et al. (US Patent 9996535; hereinafter “Skidmore”).

As per claims 2 and 17, Jang discloses: The voice control method and electronic device according to claims 1 and 15, wherein the creating the voice control relationship between the user and the target voice control window based on the user voice information, comprises: determining voice features corresponding to the user voice information (Jang; p. 0117 - in the case of multiple speakers S310, the controller 180 can recognize a voice pattern of a speaker received through a voice recognition device and carry out voice recognition according to the voice pattern S330), and determining a number N of users based on the voice features (Jang; p. 0120 – 0121 - with reference to FIG. 11, the controller 180 can recognize a first speaker S1 and a second speaker S2 respectively and display a first speaker indicator (SI1, a first avatar) corresponding to the first speaker S1 and a second speaker indicator (SI2, a second avatar) corresponding to the second speaker S2 on the display unit 151. As described above, the controller 180 can display a speaker indicator for identifying the first and the second speaker in addition to the first and the second avatar); and creating the voice control relationship between the N users and the N voice control windows, respectively (Jang; p. 0120 – 0121).

Jang, however, fails to disclose in a case that the number N of users is less than or equal to a preset number M, displaying N voice control windows in the display terminal. Skidmore does teach in a case that the number N of users is less than or equal to a preset number M, displaying N voice control windows in the display terminal (Skidmore; Col. 12, lines 9-58 - at block 620, organization service 410 accesses a display size of the user computing device to calculate an optimal and/or maximum number of items that can be displayed in the user interface. For example, the accessed display size includes a length and width in pixels, inches, millimeters, or in any other measurement unit. Continuing with the example, organization service 410 accesses a document and/or item representation size. The document and/or item representation size may include a length and width in pixels or any other measurement unit. In some embodiments, the document representation size may include spacing, in pixels or any other measurement unit, of a document from other documents and/or borders of a window. Continuing with the example, organization service 410 calculates the number of document representations, such as icons, that can fit within the display size of the user interface by dividing the display size by the accessed size and/or dimensions of the document representation. Thus, the document viewing threshold may be sixteen for a mobile computing device based on the display size of the mobile computing device and a size of a document in the user interface of the mobile computing device). Therefore, it would have been obvious to one of ordinary skill in the art to modify the voice control method and electronic device of Jang, to include in a case that the number N of users is less than or equal to a preset number M, displaying N voice control windows in the display terminal, as taught by Skidmore, in order to efficiently browse items within any user interface, voice user interface, graphical user interface, page, video, electronic book and/or other electronic content (Skidmore; Col. 2, lines 63-67).

As per claims 3 and 18, Jang in view of Skidmore discloses: The voice control method and electronic device according to claims 2 and 17, upon which claims 3 and 18 depend. And further, Skidmore teaches wherein the preset number M is determined based on a size of the display terminal or a target size corresponding to the display terminal (Skidmore; Col. 12, lines 9-58 - at block 620, organization service 410 accesses a display size of the user computing device to calculate an optimal and/or maximum number of items that can be displayed in the user interface. For example, the accessed display size includes a length and width in pixels, inches, millimeters, or in any other measurement unit. Continuing with the example, organization service 410 accesses a document and/or item representation size. The document and/or item representation size may include a length and width in pixels or any other measurement unit. In some embodiments, the document representation size may include spacing, in pixels or any other measurement unit, of a document from other documents and/or borders of a window. Continuing with the example, organization service 410 calculates the number of document representations, such as icons, that can fit within the display size of the user interface by dividing the display size by the accessed size and/or dimensions of the document representation. Thus, the document viewing threshold may be sixteen for a mobile computing device based on the display size of the mobile computing device and a size of a document in the user interface of the mobile computing device). Therefore, it would have been obvious to one of ordinary skill in the art to modify the voice control method and electronic device of Jang, to include wherein the preset number M is determined based on a size of the display terminal or a target size corresponding to the display terminal, as taught by Skidmore, in order to efficiently browse items within any user interface, voice user interface, graphical user interface, page, video, electronic book and/or other electronic content (Skidmore; Col. 2, lines 63-67).

As per claims 4 and 19, Jang in view of Skidmore disclose: The voice control method and electronic device according to claims 2 and 17, further comprising: in a case that the number N of users is greater than the preset number M, selecting, from the N users, M target users according to a preset rule, wherein the preset rule comprises: detecting a distance between the user and the display terminal, and selecting the M target users from the N users according to the distance; or, selecting the M target users from the N users according to the voice features, wherein the voice features comprises volume; and creating the voice control relationship between the M target users and the M voice control windows, respectively (Jang; p. 0165 - if signal strength of a voice signal (volume) received through the second voice input device is below a predetermined threshold value S622, can recommend the user for using the first voice input device and display an indicator for the recommendation on the display unit 151, S623).

As per claims 5 and 20, Jang in view of Skidmore discloses: The voice control method and electronic device according to claims 2 and 17, further comprising: in a case that the number N of users is less than or equal to the preset number M, obtaining relative position information of the users relative to the display terminal; and creating, according to the relative position information, the voice control relationship between the N users and the N voice control windows, respectively (Jang; Fig. 13 & 16; p. 0126-0139 - With reference to FIG. 13, in the case of multiple speakers S410, the controller 180 can recognize a speaker's location S420…; see also p. 0149-0150 - The display device 100 can carry out voice recognition upon the voice of the first speaker S1 and select a speaker icon corresponding to the first speaker S1 from multiple speaker icons 58 based on the voice recognition result…).
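The Skidmore passage quoted above reduces to simple arithmetic: divide the display dimensions by an item's footprint (including any spacing) to get the maximum item count, which plays the role of the preset number M in claims 2/17. A minimal sketch under that reading; the function name, parameters, and dimensions are invented for illustration and are not Skidmore's:

```python
def max_windows(display_w: int, display_h: int,
                item_w: int, item_h: int, spacing: int = 0) -> int:
    """Maximum item count per Skidmore's display-size calculation (col. 12)."""
    cols = display_w // (item_w + spacing)
    rows = display_h // (item_h + spacing)
    return cols * rows


M = max_windows(1920, 1080, item_w=400, item_h=500, spacing=20)  # 4 cols x 2 rows = 8
N = 3                           # number of users detected from voice features
shown = N if N <= M else M      # claims 2/17: display N windows only when N <= M
print(M, shown)                 # -> 8 3
```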
As per claims 6 and 21, Jang in view of Skidmore discloses: The voice control method and electronic device according to claims 2 and 17, wherein the creating the voice control relationship between the user and the target voice control window based on the user voice information, comprises: displaying M voice control windows in the display terminal, and assigning a window identifier to each voice control window; in a case that the user voice information comprises information matching the window identifier, determining the target voice control window from the M voice control windows according to the user voice information; and creating the voice control relationship between the user corresponding to the user voice information and the target voice control window (Jang; p. 0149-0150 - According to the control method for a display device according to an embodiment of the present invention, a first speaker S1 transmits a predetermined voice to the display device 100 through a mobile terminal 20. The display device 100 can carry out voice recognition upon the voice of the first speaker S1 and select a speaker icon corresponding to the first speaker S1 from multiple speaker icons 58 based on the voice recognition result (voice information matching the window identifier)).

As per claim 7, Jang in view of Skidmore discloses: The voice control method according to claim 6, wherein the information matching the window identifier comprises position information of the user; and the determining the target voice control window from the M voice control windows according to the user voice information, comprises: determining the target voice control window from the M voice control windows according to the position information (Jang; Fig. 13 & 16; p. 0126-0139 - With reference to FIG. 13, in the case of multiple speakers S410, the controller 180 can recognize a speaker's location S420…; see also p. 0149-0150 - The display device 100 can carry out voice recognition upon the voice of the first speaker S1 and select a speaker icon corresponding to the first speaker S1 from multiple speaker icons 58 based on the voice recognition result… if there is an existing icon/avatar on the display corresponding to the current speaker, the speaker is assigned to the matching icon/avatar).

As per claim 8, Jang in view of Skidmore discloses: The voice control method according to claim 6, further comprising: in a case that the user voice information does not comprise information matching the window identifier, obtaining relative position information of the user relative to the display terminal; and creating the voice control relationship between the user corresponding to the user voice information and the target voice control window according to the relative position information (Jang; Fig. 13 & 16; p. 0126-0139 - With reference to FIG. 13, in the case of multiple speakers S410, the controller 180 can recognize a speaker's location S420, recognize the speaker as the speaker's location is recognized, and change a pointing direction of a speaker indicator according to the speaker's location recognized S440… if there is no existing icon/avatar on the display corresponding to the user who is speaking, one is created and placed on the display relative to the user’s position).
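For claims 4/19 the combination amounts to a selection rule: when the N detected users exceed the preset M windows, keep M target users by the preset rule (nearest distance to the display terminal, or loudest volume), then create the M voice control relationships. A hedged sketch under those assumptions; the User type and its fields are invented for illustration, and the same binding step could instead key on a matched voiceprint as in claim 9:

```python
from dataclasses import dataclass


@dataclass
class User:
    name: str
    distance_m: float  # detected distance from the display terminal
    volume_db: float   # voice feature: volume


def select_targets(users: list[User], m: int, rule: str = "distance") -> list[User]:
    """Pick M target users by the preset rule of claims 4/19."""
    key = (lambda u: u.distance_m) if rule == "distance" else (lambda u: -u.volume_db)
    return sorted(users, key=key)[:m]


users = [User("A", 1.2, 58.0), User("B", 3.5, 65.0), User("C", 0.8, 52.0)]
for window_id, user in enumerate(select_targets(users, m=2), start=1):
    print(f"win-{window_id} <- {user.name}")  # creates the M voice control relationships
# -> win-1 <- C, win-2 <- A (the two nearest of the three users)
```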
As per claim 9, Jang in view of Skidmore disclose: The voice control method according to claim 2, wherein the creating the voice control relationship between the user and the target voice control window based on the user voice information, comprises: displaying M voice control windows in the display terminal (Jang; Fig. 11, items S1 & S2; p. 0120 - with reference to FIG. 11, the controller 180 can recognize a first speaker S1 and a second speaker S2 respectively and display a first speaker indicator (SI1, a first avatar) corresponding to the first speaker S1 and a second speaker indicator (SI2, a second avatar) corresponding to the second speaker S2 on the display unit 151; see also p. 0121; see also Fig. 9; see also Fig. 19 & p. 0148); determining preset voiceprint information corresponding to the M voice control windows respectively (Jang; p. 0117-0118 - in the case of multiple speakers S310, the controller 180 can recognize a voice pattern of a speaker received through a voice recognition device and carry out voice recognition according to the voice pattern S330); performing voiceprint recognition on the user voice information to obtain user voiceprint information, and in a case that the user voiceprint information matching preset voiceprint information, determining the voice control window corresponding to the preset voiceprint information as the target voice control window (Jang; p. 0119 - The controller 180 can display a speaker indicator on the display unit 151 based on a voice recognition result); and creating the voice control relationship between the user corresponding to the user voiceprint information and the target voice control window (Jang; p. 0121 - As described above, the controller 180 can display a speaker indicator for identifying the first and the second speaker in addition to the first and the second avatar. For example, while the first avatar is displayed for identifying a first speaker, to identify the second speaker, an input device indicator corresponding to a voice input device used by the second speaker can be displayed along with the first avatar. Accordingly, each speaker can know that his or her voice is recognized by the display device 100).

Conclusion

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. The prior art made of record and not relied upon includes: Chang (US PG Pub 20220368742) discloses user interfaces for managing shared-content sessions. In some embodiments, content is shared with a group of users participating in a shared-content session. In some embodiments, the content is screen-share content that is shared from one device to other participants of the shared-content session. In some embodiments, the content is synchronized content for which output of the content is synchronized across the participants of the shared-content session (Chang; Abstract).

Any inquiry concerning this communication or earlier communications from the examiner should be directed to Rodrigo A Chavez whose telephone number is (571) 270-0139. The examiner can normally be reached Monday - Friday, 9-6 ET.

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Richemond Dorvil, can be reached at 571-272-7602.
The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/RODRIGO A CHAVEZ/
Examiner, Art Unit 2658

/RICHEMOND DORVIL/
Supervisory Patent Examiner, Art Unit 2658

Prosecution Timeline

Nov 20, 2023
Application Filed
Jan 09, 2026
Non-Final Rejection — §102, §103 (current)

Precedent Cases

Applications granted by the same examiner with similar technology

Patent 12597430
MULTI-CHANNEL SIGNAL GENERATOR, AUDIO ENCODER AND RELATED METHODS RELYING ON A MIXING NOISE SIGNAL
2y 5m to grant • Granted Apr 07, 2026
Patent 12579984
DATA AUGMENTATION SYSTEM AND METHOD FOR MULTI-MICROPHONE SYSTEMS
2y 5m to grant • Granted Mar 17, 2026
Patent 12541653
ENTERPRISE COGNITIVE SOLUTIONS LOCK-IN AVOIDANCE
2y 5m to grant • Granted Feb 03, 2026
Patent 12542136
DYNAMICALLY CONFIGURING A WARM WORD BUTTON WITH ASSISTANT COMMANDS
2y 5m to grant • Granted Feb 03, 2026
Patent 12531077
METHOD AND APPARATUS IN AUDIO PROCESSING
2y 5m to grant • Granted Jan 20, 2026
Study what changed to get past this examiner. Based on 5 most recent grants.

AI Strategy Recommendation

Get an AI-powered prosecution strategy using examiner precedents, rejection analysis, and claim mapping.

Prosecution Projections

Expected OA Rounds
1-2
Grant Probability
50%
With Interview (+37.3%)
88%
Median Time to Grant
3y 5m
PTA Risk
Low
Based on 228 resolved cases by this examiner. Grant probability derived from career allow rate.
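The projection figures are reproducible from the raw counts shown under Examiner Intelligence, assuming the interview lift is additive in percentage points:

```python
# Reconstructing the dashboard figures from the raw counts (an assumption
# about the methodology, consistent with the displayed numbers).
granted, resolved = 115, 228
allow_rate = 100 * granted / resolved            # 50.4 -> the "50%" grant probability
with_interview = allow_rate + 37.3               # +37.3 pt interview lift
print(round(allow_rate), round(with_interview))  # -> 50 88
```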
