DETAILED ACTION
This action is in response to communications filed 2/26/2026:
Claims 1-13 are pending.
Claims 12-13 are added.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Arguments
Applicant’s arguments with respect to claim(s) 1-13 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.
Response to Amendment
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-3, 5-6, and 9-11 are rejected under 35 U.S.C. 103 as being unpatentable over Kumagai et al. (JP2009044261, translated by EPO, hereinafter “Kumagai”) in view of Lang et al. (US20240284137, hereinafter “Lang”).
Regarding claim 1, Kumagai teaches a control apparatus (¶1, apparatus) comprising:
one or more memories storing instructions (¶20, memory 65); and
one or more processors (¶20, DSP) executing the instructions to:
obtain a first position of a listening point (¶13, generating optimal audio requiring a listening position);
calculate a respective position of each of one or more virtual sound sources with respect to the first position of the listening point (¶13, localizing a plurality of virtual sound sources around the determined listening position);
calculate positions of a plurality of speakers with respect to the first position of the listening point, the plurality of speakers being located around the listening point (¶13, determining position information of the plurality of speakers with respect to the listening position and also positioning speakers around the listening environment (including the user) (see Fig. 3));
generate a respective output signal to be output to each of the plurality of speakers based on one or more sound source signals each output from the one or more virtual sound sources, the respective position of each of the one or more virtual sound sources with respect to the first position of the listening point, and the positions of the plurality of speakers with respect to the first position of the listening point (¶14, outputting plurality of audio signals to each respective speaker based on the position of the user, sound output position, and position of the speakers);
cause each of the plurality of speakers to reproduce sound corresponding to the respective output signal (Fig. 7, outputting audio to be reproduced);
Kumagai fails to explicitly teach detect movement of the listening point to a second position;
calculate (i) the respective position of each of the one or more virtual sound sources with respect to the second position of the listening point and (ii) the positions of the plurality of speakers with respect to the second position of the listening point; and
regenerate the output signal to be output to each of the plurality of speakers based on (i) the calculated respective position of each of the one or more virtual sound sources with respect to the second position of the listening point and (ii) the calculated positions of the plurality of speakers with respect to the second position of the listening point.
Lang teaches detect movement of the listening point to a second position (¶46, the user’s position is tracked in real-time to update the output audio);
calculate (i) the respective position of each of the one or more virtual sound sources with respect to the second position of the listening point and (ii) the positions of the plurality of speakers with respect to the second position of the listening point (¶50-53, different user location may affect virtual playback format which defines virtual speaker positions (and thus virtual source positions)); and
regenerate the output signal to be output to each of the plurality of speakers based on (i) the calculated respective position of each of the one or more virtual sound sources with respect to the second position of the listening point and (ii) the calculated positions of the plurality of speakers with respect to the second position of the listening point (¶46, 50-53, user position is tracked in real-time to cause an update on the playback of the rendered audio wherein the update on the playback is further affected by the user’s current position and virtual source/speaker positions).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to apply the method of tracking a user’s positioning (as taught by Lang) to the audio reproduction system (as taught by Kumagai). The rationale to do so is to apply a known technique to a known device ready for improvement to yield the predictable result of updating audio output in accordance with a user’s position in order to realize better localized audio output (Lang, ¶2).
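For illustration only (this sketch is the examiner's own and is not drawn from Kumagai or Lang; the layout coordinates are hypothetical), the claimed recalculation of source and speaker positions when the listening point moves from a first position to a second position amounts to re-expressing every position relative to the new listening point:

```python
import numpy as np

def relative_positions(listening_point, positions):
    """Express each absolute position relative to the listening point."""
    return [np.asarray(p) - np.asarray(listening_point) for p in positions]

# Hypothetical layout: two virtual sound sources, four speakers.
sources = [(1.0, 2.0, 0.0), (-1.0, 1.0, 0.0)]
speakers = [(2.0, 2.0, 0.0), (-2.0, 2.0, 0.0),
            (2.0, -2.0, 0.0), (-2.0, -2.0, 0.0)]

# First position of the listening point.
first_position = (0.0, 0.0, 0.0)
rel_src_1 = relative_positions(first_position, sources)
rel_spk_1 = relative_positions(first_position, speakers)

# Movement to a second position is detected; both sets of relative
# positions are recalculated, and the output signals would then be
# regenerated from the new geometry.
second_position = (0.5, 0.0, 0.0)
rel_src_2 = relative_positions(second_position, sources)
rel_spk_2 = relative_positions(second_position, speakers)
```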
Regarding claim 2, Kumagai in view of Lang teaches wherein the one or more processors further execute the instructions to generate the respective output signal to be output to each of the plurality of speakers by distributing the one or more sound source signals to the plurality of speakers based on the respective position of each of the one or more virtual sound sources with respect to the first position of the listening point and the positions of the plurality of speakers with respect to the first position of the listening point (Kumagai, ¶69, distributing sound to one or more speakers on the basis of speaker position, virtual sound position, and receiving point/user position).
Regarding claim 3, Kumagai in view of Lang teaches wherein the one or more processors further execute the instructions to select a set of speakers to which a sound source signal, of the one or more sound source signals, is distributed, from among the plurality of speakers, based on the respective position of each of the one or more virtual sound sources with respect to the first position of the listening point and the positions of the plurality of speakers with respect to the first position of the listening point, and to distribute the sound source signal to the selected set of speakers (Kumagai, ¶69, selecting a subset of speakers from the plurality of speakers (e.g. 4 out of the 8 shown in Fig. 3) to output audio from and to apply a distribution of audio to the selected speakers).
Regarding claim 5, Kumagai in view of Lang teaches wherein the one or more processors further execute the instructions to correct the respective output signal to be output to each of the plurality of speakers based on a distance between the listening point and each of the plurality of speakers (Kumagai, ¶30, 49, applying one or more correction to the audio output based on at least a distance parameter).
Regarding claim 6, Kumagai in view of Lang teaches wherein the one or more processors further execute the instructions to perform correction processing related to at least one of a sound pressure and a delay (Kumagai, ¶76, sense of distance can be altered by adjusting a delay and amplitude of the output signal).
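As a minimal sketch of the distance-based correction addressed in claims 5-6 (the examiner's own illustration, not reproduced from Kumagai; the 1/r gain law and reference distance are assumptions), sound pressure and delay compensation per speaker can be derived directly from the speaker's distance to the listening point:

```python
SPEED_OF_SOUND = 343.0  # m/s, approximate at room temperature

def distance_correction(distance_m, reference_m=1.0):
    """Gain and delay compensating a speaker's distance from the listener.

    Gain follows the inverse-distance (1/r) law relative to a reference
    distance; delay is the extra propagation time, in seconds.
    """
    gain = reference_m / distance_m
    delay_s = (distance_m - reference_m) / SPEED_OF_SOUND
    return gain, delay_s

gain, delay = distance_correction(2.0)
# A speaker at 2 m is attenuated to half amplitude and arrives
# roughly 2.9 ms later than one at the 1 m reference distance.
```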
Regarding claim 9, Kumagai in view of Lang teaches the apparatus further comprising the plurality of speakers (Kumagai, Fig. 3, plurality of speakers).
Regarding claim 10, it is rejected for reasons similar to those set forth for claim 1. The claimed method can be found in Kumagai (¶18, method of localizing audio).
Regarding claim 11, it is rejected for reasons similar to those set forth for claim 1. The claimed medium can be found in Kumagai (Fig. 3, a CPU executing computing instructions that are necessarily stored on a form of medium).
Regarding claim 12, Kumagai in view of Lang teaches wherein the one or more processors further execute the instructions to:
transform coordinates of each of the one or more virtual sound sources described in a virtual world coordinate system into coordinates in an actual world coordinate system; and
calculate the respective position of each of the one or more virtual sound sources with respect to the first position of the listening point using the transformed coordinates in the actual world coordinate system (Lang, ¶72-73, coordinates can be provided to the user based on their location and virtual playback format may also provide coordinates that define where the virtual speaker is to be placed in the environment (and therefore also the virtual sound sources)).
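The coordinate transformation recited in claim 12 can be sketched as follows (the examiner's own illustration, not taken from Lang; the rigid-transform model and the specific rotation and offset are assumptions): a point in the virtual-world coordinate system is mapped into the actual-world coordinate system, after which its position relative to the listening point is obtained by subtraction:

```python
import numpy as np

def virtual_to_actual(p_virtual, rotation, translation):
    """Map a point from the virtual-world frame into the actual-world
    frame via a rigid transform: p_actual = R @ p_virtual + t."""
    return rotation @ np.asarray(p_virtual) + np.asarray(translation)

# Hypothetical alignment: the virtual frame is rotated 90 degrees about
# the z axis and offset 1 m along the actual-world x axis.
theta = np.pi / 2
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])
t = np.array([1.0, 0.0, 0.0])

source_actual = virtual_to_actual([1.0, 0.0, 0.0], R, t)

# The position with respect to the first position of the listening
# point is then computed in actual-world coordinates.
listening_point = np.array([0.0, 0.0, 0.0])
source_rel = source_actual - listening_point
```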
Claim 4 is rejected under 35 U.S.C. 103 as being unpatentable over Kumagai et al. (JP2009044261, translated by EPO, hereinafter “Kumagai”) in view of Lang et al. (US20240284137, hereinafter “Lang”) and further in view of Nelson et al. (US20040170281, hereinafter “Nelson”).
Regarding claim 4, Kumagai in view of Lang fails to explicitly teach wherein the one or more processors further execute the instructions to:
create a set of adjacent speakers as viewed from the first position of the listening point;
calculate an inverse matrix using a vector indicating a direction of each of the plurality of speakers as viewed from the position of the listening point for each set of the speakers; and
generate the respective output signal to be output to each of the plurality of speakers by determining a set of the speakers to which a sound source signal, of the one or more sound source signals, is distributed based on the inverse matrix and the respective position of each of the one or more virtual sound sources with respect to the first position of the listening point.
Nelson teaches wherein the one or more processors further execute the instructions to:
create a set of adjacent speakers as viewed from the first position of the listening point (Fig. 1a, set of speakers as viewed from the listening position);
calculate an inverse matrix using a vector indicating a direction of each of the plurality of speakers as viewed from the position of the listening point for each set of the speakers (Fig. 1a, ¶61, creating a set of inverse filters (matrix of inverse filters) as viewed from the listening position); and
generate the respective output signal to be output to each of the plurality of speakers by determining a set of the speakers to which a sound source signal, of the one or more sound source signals, is distributed based on the inverse matrix and the respective position of each of the one or more virtual sound sources with respect to the first position of the listening point (Fig. 1a, ¶61, using the determined matrix of inverse filters to be applied to the output signals).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to apply the audio distribution method (as taught by Nelson) to the audio reproduction system (as taught by Kumagai in view of Lang). The rationale to do so is to apply a known technique to a known device in the same way to achieve the result of achieving excellent center images (Nelson, ¶74).
Claims 7-8 are rejected under 35 U.S.C. 103 as being unpatentable over Kumagai et al. (JP2009044261, translated by EPO, hereinafter “Kumagai”) in view of Lang et al. (US20240284137, hereinafter “Lang”) and further in view of Marten (US20210105563).
Regarding claim 7, Kumagai in view of Lang fails to explicitly teach wherein the one or more processors further execute the instructions to:
generate a three-dimensional video image of a virtual space as viewed from the first position of the listening point; and
display the generated three-dimensional video image of the virtual space within a space in which the plurality of speakers is located.
Marten teaches wherein the one or more processors further execute the instructions to:
generate a three-dimensional video image of a virtual space as viewed from the first position of the listening point; and
display the generated three-dimensional video image of the virtual space within a space in which the plurality of speakers is located (¶60, Fig. 4, generating a 3D representation of the listening environment including the plurality of speaker positions).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the audio configuration technique (as taught by Marten) with the audio reproduction system (as taught by Kumagai in view of Lang). The rationale to do so is to combine prior art elements in a predictable manner to achieve the predictable result of modifying and optimizing audio output with the aid of a graphic representation of the reproduction system and its environment (Marten, ¶4).
Regarding claim 8, Kumagai in view of Lang in further view of Marten teaches wherein the one or more processors further execute the instructions to adjust the respective output signal to be output to each of the plurality of speakers based on a shielding situation where the one or more virtual sound sources are shielded by a shield in the generated three-dimensional video image of the virtual space (Marten, ¶45, optimizing audio output with respect to the placement/existence of furniture (or objects that are known to have a physical effect on the audio output); Kumagai, ¶49, calibrating audio output based on a test sound).
Claim 13 is rejected under 35 U.S.C. 103 as being unpatentable over Kumagai et al. (JP2009044261, translated by EPO, hereinafter “Kumagai”) in view of Lang et al. (US20240284137, hereinafter “Lang”) and further in view of Kimura et al. (US20240340605, hereinafter “Kimura”).
Regarding claim 13, Kumagai in view of Lang fails to explicitly teach wherein the respective position of each of the one or more virtual sound sources with respect to the first position of the listening point and the positions of the plurality of speakers with respect to the first position of the listening point are each represented as coordinates in a three-dimensional coordinate system having the first position of the listening point as an origin and having axis directions that match axis directions of an actual world coordinate system.
Kimura teaches wherein the respective position of each of the one or more virtual sound sources with respect to the first position of the listening point and the positions of the plurality of speakers with respect to the first position of the listening point are each represented as coordinates in a three-dimensional coordinate system having the first position of the listening point as an origin and having axis directions that match axis directions of an actual world coordinate system (Fig. 6, 19, ¶150, the user is placed at the origin of a 3D coordinate system wherein one or more sound sources can be placed around the user in said coordinate system – the coordinate system further comprising axis directions (X, Y, Z)).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to apply the sound localization technique (as taught by Kimura) to the audio reproduction system (as taught by Kumagai in view of Lang). The rationale to do so is to apply a known technique to a known system ready for improvement to yield the predictable result of simulating a plurality of sounds around the user while also improving speech intelligibility (sound source separation distancing) (Kimura, ¶43).
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Refer to PTO-892, Notice of References Cited for a listing of analogous art.
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to QIN ZHU whose telephone number is (571)270-1304. The examiner can normally be reached on Monday-Thursday 6AM-4PM EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Duc Nguyen, can be reached on 571-272-7503. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/QIN ZHU/Primary Examiner, Art Unit 2691