Prosecution Insights
Last updated: April 19, 2026
Application No. 18/535,011

INFORMATION PROCESSING METHOD, APPARATUS AND COMPUTER PROGRAM

Final Rejection — §103
Filed: Dec 11, 2023
Examiner: DORVIL, RICHEMOND
Art Unit: 2658
Tech Center: 2600 — Communications
Assignee: Sony Interactive Entertainment Inc.
OA Round: 2 (Final)
Grant Probability: 22% (At Risk)
Expected OA Rounds: 3-4
Time to Grant: 3y 0m
Grant Probability With Interview: 48%

Examiner Intelligence

Career Allow Rate: 22% (11 granted / 49 resolved; -39.6% vs TC avg). Grants only 22% of cases.
Interview Lift: +25.6% (strong; based on resolved cases with interview)
Typical Timeline: 3y 0m average prosecution; 12 applications currently pending
Career History: 61 total applications across all art units

Statute-Specific Performance

§101: 16.4% (-23.6% vs TC avg)
§103: 46.3% (+6.3% vs TC avg)
§102: 14.4% (-25.6% vs TC avg)
§112: 17.0% (-23.0% vs TC avg)
Tech Center averages are estimates. Based on career data from 49 resolved cases.

Office Action — §103
DETAILED ACTION

The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office action.

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Amendment

Drawings

The drawings were received on 10/27/2025. These drawings are acceptable.

Response to Arguments

Applicant's arguments with respect to claims 1, 2, 5, 6, 9, 10, 11, 12, 13 and 14 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument. The rejection can be overcome if the independent claims clearly recite that the mistakes or errors in the audio content are explicitly limited to linguistic and/or semantic errors. The Brimijoin and Seefeldt references relate to errors/mistakes in the audio content based on perceptual measures (excitation, specific loudness) and comparing measured values to desired target values.

Claim Objections

Claims 1-14 are objected to because of the following informalities: the claim 1 preamble recites "first audio content" while line 4 recites "a first audio content". This could be confusing. Appropriate correction is required. Claims 13 and 14 have similar issues. Claims 2-12 incorporate the deficiencies of claim 1.

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103, which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 2, 5, 6, 9, 10, 11, 12, 13 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Brimijoin, II et al., US Patent No. 11,245,984 (hereinafter Brimijoin) in view of Seefeldt, US Publication 2007/0291959.

As per claims 1, 13 and 14, Brimijoin discloses an information processing method and system for generating corrected audio content comprising:

receiving a first audio content generated by a user and captured at an audio receiving device (see Fig. 2, item 210, "Microphone array"; Fig. 4, item 410; and ¶73, "the audio system captures sound using one or more microphones coupled to a frame of a headset worn by a user");

identifying a target portion of the first audio content having a predetermined characteristic (see ¶55, "The DOA estimation module 240 is configured to localize sound sources in the local area based in part on captured sound from the microphone array"; ¶61, "The source identification module 255 is configured to determine a sound source (e.g., a target sound source) of the plurality of sound sources in the local area that is of interest to the user at any given time."; and Fig. 4, items 420 and 430, with ¶75, "the audio system determines the target sound source based on implicit user input stored in the model of the local area and/or express user input provided directly by the user. For example, the model of the local area may include a mapped gaze vector (i.e., an implicit user input) that is utilized by the audio system in the determination of the target sound source. The mapped gaze vector may have been determined by an eye tracking system of the headset"). This limitation can be interpreted and read as "determining a target sound source of the one or more sound sources";

selecting correction processing to be performed on the target portion of the first audio content in accordance with the predetermined characteristic of the target portion (see ¶62 and ¶63, "The sound filter module 260 determines one or more filters to apply to one or more sound signals. The sound signals may correspond to sound emitted by the target sound source … In some embodiments, the one or more sound filters may cause the sound signal associated with the target sound source to be enhanced" and "The sound filter module 260 may determine one or more filters based on the sound profile of the user. For example, the sound filter module 260 may select a filter that amplifies certain frequencies based on the sound profile of the user…"; and Fig. 4, items 440 and 450, with ¶76, "…The sound profile may be stored in the audio system. Based on the sound profile, the audio system may determine to apply a filter that enhances the sound signal associated with the target sound source…"). This limitation is interpreted as determining one or more filters to apply to a sound signal associated with the target source in the captured sound;

generating corrected (augmented sound signal) audio content in which the target portion of the first audio content has been corrected by performing the selected correction processing on the target portion of the first audio content (see ¶66, "The sound filter module 260 may apply the one or more filters to the sound signal to generate the augmented sound signal. In some embodiments, the augmented sound signal may be provided to the transducer array 320 for presentation to the user"; and Fig. 4, items 450 and 460, with ¶77, "The audio system generates 450 an augmented sound signal by applying the one or more filters to the sound signal. In one embodiment, the augmented sound signal is such that sound appearing to originate from the target sound source is distinguishable from sound emitted by other sound sources." and ¶78). Interpreted as generating an augmented sound signal by applying one or more filters to the sound signal;

providing the corrected (or augmented) audio content at an apparatus as instructions for the apparatus to perform an operation in accordance with the corrected (augmented) audio content (see ¶3, "The audio controller further generates an augmented sound signal by applying the one or more filters to the sound signal and provides the augmented sound signal to the in-ear device for presentation to a user."; ¶78, "The audio system provides 460 the augmented sound signal to a speaker assembly for presentation to a user. In one embodiment, the augmented sound signal is provided to a speaker assembly of an in-ear device worn by the user."; and ¶54, "The calibration module 237 may update the sound profile of the user as needed over time. For example, during operation of the audio system 200, the calibration module may receive feedback from the user related to the performance of the headset and/or the in-ear devices 270. … Based on the received feedback, the calibration module 237 may update the sound profile of the user accordingly."). Interpreted as providing the augmented sound signal to a speaker assembly for presentation to a user, and updating the sound profile of the user based on user feedback.

Brimijoin fails to explicitly teach identifying a target portion … indicative of an error or mistake in the audio content, and performing an operation in accordance with the corrected audio content. See claim 1, lines 7 and 14. However, these features are well known in the art as evidenced by Seefeldt, which discloses a system for the analysis of incoming audio in time-frequency bands to reduce the difference between measured and target audio content and produce modified audio, comprising: identifying a target portion … indicative of an error or mistake in the audio content (see Fig. 1 and ¶79, "(1) a signal path having a process or device 2 ('Modify Audio Signal') capable of modifying the audio in response to modification parameters"; and ¶101, "the unmodified audio signal and either (1) the modification parameters or (2) the target specific loudness or a representation of the target specific loudness (e.g., scale factors usable in calculating, explicitly or implicitly, target specific loudness) may be stored or transmitted for use, for example, in a temporally and/or spatially separated device or process. The modification parameters, target specific loudness…"); and providing the corrected audio content at an apparatus as instructions to perform an operation in accordance with the corrected audio content (see ¶58, "the unmodified audio signal and either (1) the modification parameters or (2) the target specific loudness … may be stored or transmitted … The played-back or received modification parameters may then be applied to a Modify Audio Signal 2 … to modify the played-back or received audio signal …"; and ¶101, explicit storage/transmit/receive arrangements for parameters or modified audio and downstream application at a decoder/player).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to produce modified audio of a received audio signal indicative of an error or mistake in the audio content, based on corrected or modification parameters, in the Brimijoin system as taught by Seefeldt, for the purpose of improving the augmented audio and improving perceived audio quality and intelligibility of audio signals for a listener.

As per claim 2, Brimijoin and Seefeldt disclose all the limitations of claim 1, upon which claim 2 depends. Brimijoin further discloses an information processing method wherein the predetermined characteristic of the first audio content comprises at least one of: a repeated audio, a predetermined audio content, and audio content having a tempo below a predetermined threshold value (see col. 3, lines 1-2, "the audio content may include re-broadcast captured sound …" (repeated audio); col. 221, line 20, "individualized and enhanced audio content"). Seefeldt also discloses audio content having a tempo below a predetermined threshold value (see ¶101). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the Brimijoin system with Seefeldt to improve the augmented audio and improve perceived audio quality and intelligibility of audio signals for a listener.

As per claim 5, Brimijoin and Seefeldt disclose all the limitations of claim 1, upon which claim 5 depends. Brimijoin further discloses wherein identifying the target portion of the first audio content having the predetermined characteristic comprises comparison of the first audio content with calibration data provided by a user (see Fig. 2, item 237, and col. 14, lines 5-35, "The calibration module 237 generates a sound profile of the user. The sound profile is personalized sound information about the user describing how well a user hears sounds at different frequencies. The sound profile may include information from one or more audiograms, loudness discomfort levels test results, speech-in-noise test results,").

As per claim 6, Brimijoin and Seefeldt disclose all the limitations of claim 1, upon which claim 6 depends. Brimijoin further discloses acquiring first image data of the user corresponding to the first audio content from an image capture device (see Fig. 4, item 420, and col. 20, lines 16-21); and identifying the target portion of the first audio content having the predetermined characteristic in accordance with the first image data which has been acquired (see Fig. 4, item 420, and col. 20, lines 16-21).

As per claim 9, Brimijoin and Seefeldt disclose all the limitations of claim 1, upon which claim 9 depends. Brimijoin further discloses wherein the method comprises selecting the correction processing by comparing the predetermined characteristic of the target portion with a look-up table associating predetermined characteristics of audio content with correction processing (see col. 16, lines 7-13).

As per claim 10, Brimijoin and Seefeldt disclose all the limitations of claim 1, upon which claim 10 depends. Brimijoin further discloses wherein the method further comprises performing a control operation in accordance with the corrected audio content which has been generated (see col. 7, lines 29-37; col. 23, lines 36-51).

As per claim 11, Brimijoin and Seefeldt disclose all the limitations of claim 10, upon which claim 11 depends. Brimijoin further discloses wherein the control operation includes one or more of: storing and/or transmitting the corrected audio content (see col. 23, lines 36-51).

As per claim 12, Brimijoin and Seefeldt disclose all the limitations of claim 10, upon which claim 12 depends. Brimijoin further discloses wherein an avatar of a user is displayed and performing the control operation comprises controlling an appearance of the avatar of the user in accordance with the corrected audio content (see col. 7, lines 29-36, "the PCA may capture images of the user. The images captured by the PCA of the user may be used to update the model of the local area with gestures performed by the user. A gesture is any movement performed by the user that is indicative of a command (i.e., an implicit user input). A gesture performed by the user may include, e.g., a pointing gesture with the user's hand(s), finger(s), arm(s), some other movement performed by the user indicative of a command, or some combination thereof.").

As per claim 13, claim 13 is similar in scope and content to claim 1 rejected above, and therefore claim 13 is rejected under the same rationale (Brimijoin discloses an apparatus for performing the steps of claim 1; see the audio system of Fig. 2).

As per claim 14, claim 14 is similar in scope and content to claims 1 and 13 rejected above, and therefore is rejected under the same rationale (see col. 25, lines 44-51).

Claims 3, 4, 7 and 8 are rejected under 35 U.S.C. 103 as being unpatentable over Brimijoin and Seefeldt in view of Weng et al., US Patent No. 7,930,168.

As per claim 3, Brimijoin and Seefeldt disclose all the limitations of claim 1, upon which claim 3 depends. Brimijoin fails to explicitly teach an information processing method wherein the audio content is at least one of: a word, a syllable, a consonant, and/or a vowel. However, this feature is well known in the art as evidenced by Weng et al., which discloses, in the same field of endeavor, a system/method for processing disfluent speech wherein the audio content is at least one of a word, a syllable, a consonant, and/or a vowel (see col. 2, lines 24-43, and Fig. 1, "The system interprets spoken word inputs such as sentence fragment 102"). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the Brimijoin and Seefeldt system with the Weng et al. system to generate augmented/corrected audio that a computer would better understand, without the "ummm" stutter, silence, and/or incorrect grammar (Weng et al., col. 1, lines 24-36).

As per claim 4, Brimijoin and Seefeldt disclose all the limitations of claim 1, upon which claim 4 depends. Brimijoin further discloses identifying the target portion of the first audio content having the predetermined characteristic by analyzing a waveform of the first audio content (see col. 2, lines 7-14, "determining one or more filters to apply to a sound signal associated with the target sound source in the captured sound"; applying the filters inherently involves analyzing the sound waveform). Brimijoin fails to explicitly teach an information processing method wherein the method comprises (alternatively) converting the first audio content into text and analyzing the text to identify the target portion of the first audio content having the predetermined characteristic; or wherein identifying the target portion of the first audio content having the predetermined characteristic comprises use of a trained model. However, these features are well known in the art as evidenced by Weng et al., which discloses, in the same field of endeavor, a system/method for processing disfluent speech wherein the method comprises converting the first audio content into text and analyzing the text to identify the target portion of the first audio content having the predetermined characteristic (see col. 2, lines 34-36, "Speech recognition unit 110 transcribes the sounds of human speech into text data. This text is then sent to part-of-speech tagger 112 which labels each text word with a part-of-speech (POS) tag such as 'noun', 'verb', etc. The text, now annotated with POS tags, is input to a disfluence identifier 114"), or wherein identifying the target portion of the first audio content having the predetermined characteristic comprises use of a trained model (see Fig. 3 and col. 3, lines 30-35). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the Brimijoin and Seefeldt system with the Weng et al. system to generate augmented/corrected audio that a computer would better understand, without the "ummm" stutter, silence, and/or incorrect grammar (Weng et al., col. 1, lines 24-36).

As per claim 7, Brimijoin and Seefeldt disclose all the limitations of claim 1, upon which claim 7 depends. Brimijoin fails to explicitly teach wherein the correction processing to be performed comprises at least one of: removal of at least the target portion of the first audio content, replacement of at least the target portion of the first audio content, and/or adaptation of the tempo of at least the target portion of the first audio content. However, these features are well known in the art as evidenced by Weng et al., which discloses, in the same field of endeavor, wherein the correction processing to be performed comprises at least one of: removal of at least the target portion of the first audio content, replacement of at least the target portion of the first audio content, and/or adaptation of the tempo of at least the target portion of the first audio content (see col. 3, lines 30-54). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the Brimijoin and Seefeldt system with the Weng et al. system to generate augmented/corrected audio that a computer would better understand, without the "ummm" stutter, silence, and/or incorrect grammar (Weng et al., col. 1, lines 24-36).

As per claim 8, Brimijoin and Seefeldt disclose all the limitations of claim 7, upon which claim 8 depends. Brimijoin further discloses wherein the method comprises replacing the target portion of the first audio content with synthesized audio content and/or pre-recorded audio content (see Fig. 4, item 460).

Conclusion

Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).

A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to RICHEMOND DORVIL, whose telephone number is (571) 272-7602. The examiner can normally be reached 8:30-5:30 M-F. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/RICHEMOND DORVIL/
Supervisory Patent Examiner, Art Unit 2658

Prosecution Timeline

Dec 11, 2023: Application Filed
Jul 23, 2025: Non-Final Rejection — §103
Sep 23, 2025: Interview Requested
Oct 07, 2025: Applicant Interview (Telephonic)
Oct 08, 2025: Examiner Interview Summary
Oct 27, 2025: Response Filed
Mar 13, 2026: Final Rejection — §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12591738: Autocorrect Candidate Selection (granted Mar 31, 2026; 2y 5m to grant)
Patent 12573397: ELECTRONIC APPARATUS AND CONTROLLING METHOD THEREOF (granted Mar 10, 2026; 2y 5m to grant)
Patent 12567401: EVALUATING RELIABILITY OF AUDIO DATA FOR USE IN SPEECH PROCESSING (granted Mar 03, 2026; 2y 5m to grant)
Patent 12547849: ABSTRACTIVE SUMMARIZATION OF INFORMATION TECHNOLOGY ISSUES USING A METHOD OF GENERATING COMPARATIVES (granted Feb 10, 2026; 2y 5m to grant)
Patent 12505853: SIGNAL PROCESSING DEVICE AND METHOD (granted Dec 23, 2025; 2y 5m to grant)
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 3-4
Grant Probability: 22%
With Interview: 48% (+25.6%)
Median Time to Grant: 3y 0m
PTA Risk: Moderate
Based on 49 resolved cases by this examiner. Grant probability derived from career allow rate.
