Last updated: April 19, 2026

Application No. 18/784,475

LOW-QUALITY AUDIO DETECTION

Non-Final OA §102 §103 §112

Filed: Jul 25, 2024

Examiner: ELAHEE, MD S

Art Unit: 2694

Tech Center: 2600 (Communications)

Assignee: Zoom Video Communications, Inc.

OA Round: 1 (Non-Final)

Interview Optional

+27.8% interview lift. This examiner has a relatively high allow rate; a written response may suffice.

Based on 827 resolved cases, 2023–2026

Examiner Intelligence

ELAHEE, MD S

Grants 79% — above average

Career Allow Rate

655 granted / 827 resolved

+17.2% vs TC avg
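The headline figures above can be cross-checked with simple arithmetic. The following is a minimal sketch, assuming the allow rate is granted cases divided by resolved cases and that the "+17.2% vs TC avg" delta is expressed in percentage points; the page states neither assumption explicitly.

```python
# Cross-check of the career allow rate reported on this page.
granted = 655    # from "655 granted / 827 resolved"
resolved = 827

allow_rate = granted / resolved * 100   # percent
print(f"allow rate: {allow_rate:.1f}%")

# Assumption: the delta versus the Tech Center average is in percentage points.
delta_vs_tc = 17.2
implied_tc_avg = allow_rate - delta_vs_tc
print(f"implied TC average: {implied_tc_avg:.1f}%")
```

Under these assumptions the computed rate rounds to the 79% shown above.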

Strong +28% interview lift

Interview Lift: +27.8% (with vs. without interview, resolved cases)

Typical timeline

Avg Prosecution: 3y 3m

28 currently pending

Career history

Total Applications: 855 (across all art units)

Statute-Specific Performance

§101: 6.2% (-33.8% vs TC avg)

§103: 50.4% (+10.4% vs TC avg)

§102: 20.0% (-20.0% vs TC avg)

§112: 8.9% (-31.1% vs TC avg)

Based on career data from 827 resolved cases

Office Action

§102 §103 §112

Notice of Pre-AIA or AIA Status
          The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

DETAILED ACTION       

Claim Objections
Regarding claim 7, the phrase “number of metrics in the plurality of metrics” in lines 7-8 should apparently be “number of metrics of the plurality of metrics”. Appropriate correction is required.
Regarding claim 12, the phrase “ML model” in line 7 should apparently be “machine learning ("ML") model”. Appropriate correction is required.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):

(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:

The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention. 


Claim 7 is rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA ), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor, or for pre-AIA  the applicant regards as the invention.
            
           Regarding claim 7, the phrase “computing a ratio of the number of metrics to the number of metrics” in line 7 of the claim is indefinite. There are two “the number of metrics”. It is unclear whether the claimed “ratio” is being computed between two different “the number of metrics” or two same “the number of metrics”.
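The indefiniteness concern can be illustrated numerically. The sketch below is only an illustration: the metric values, the threshold, and the second reading of the ratio are hypothetical and are not taken from the claim text, which is not reproduced on this page.

```python
# Hypothetical metric values and threshold, chosen only for illustration.
metrics = [0.9, 0.4, 0.7, 0.2]
threshold = 0.5

# Reading 1: both phrases refer to the same count, so the ratio is always 1.
same_count_ratio = len(metrics) / len(metrics)

# Reading 2 (an assumption, not the claim text): the numerator counts only
# the metrics that exceed the threshold; the denominator counts all of them.
satisfying = sum(1 for m in metrics if m > threshold)
distinct_count_ratio = satisfying / len(metrics)

print(same_count_ratio)      # 1.0
print(distinct_count_ratio)  # 0.5
```

If both occurrences of "the number of metrics" denote the same quantity, the ratio carries no information, which is why the claim needs to say which two counts are being compared.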

Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale or otherwise available to the public before the effective filing date of the claimed invention.



Claims 1-3, 12, 13, 17 and 18 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Rao et al. (US Pub. No. US12,047,536B1).
            Regarding claim 1, with respect to Figures 1-8, Rao teaches a method, comprising:
           joining a first client device to a first media conference [i.e., video conference], a first plurality of client devices connected to the first video conference (fig. 3A, step 302; col. 10, lines 1-14: "media conference is initiated"); 
             receiving, from the first client device, a first input signal [i.e., audio stream] (fig. 3A, step 306; col. 10, lines 28-37: "first input signal is received from the first input device", "first input signal can transmit audio data"); 
            determining, using an input signal quality determination component/ machine learning algorithm [i.e., machine learning ("ML") model], at least one first audio quality measurement based on the first audio stream (fig. 3A, step 308; col. 4, lines 13-15 and 51-61: "machine learning algorithm can be applied by the input signal quality determination component"; col. 10, lines 38-56; col. 11, lines 9-19: "muffle can be determined, for example, using a learning algorithm trained on prior speech"); 
            computing a quality difference between the first input signal and the second input signal / first characteristic [i.e., first metric] for the first audio stream based on the at least one first audio quality measurement (col. 11, line 47 - col. 12 line 38, col.15, lines 11-14) (Note; comparison of "characteristic" to "thresholds" and “quality difference” imply that these two must be provided as a numerical metric.); and 
            in response to the quality difference between the first input signal and the second input signal [i.e., first metric] being greater than [i.e., satisfying] a predetermined threshold, outputting a message including first information about a first low-quality audio status associated with the first audio stream (fig. 3A, steps 310, 312, 322; col. 14, line 64 - col. 15, line 17: "the input device selection component 222 can send a notification to the participant that a change in input device is recommended"). 

            Regarding claims 2, 13 and 18, Rao teaches in response to the first metric exceeding the predetermined threshold, outputting a recommendation [i.e., command] (col. 15, lines 1-17: "change [...] recommended") to cause a corrective action to remedy the first low-quality audio status associated with the first audio stream (col. 15, lines 7-10: "change input device automatically").

            Regarding claim 3, Rao teaches wherein the message further includes one or more recommendations to remedy the first low-quality audio status associated with the first audio stream (col. 15, lines 1-7: "change [...] recommended", col. 15, lines 7-10: "change input device automatically").


            Claim 12 is rejected for the same reasons as discussed above with respect to claim 1.
Furthermore, Rao teaches a non-transitory computer-readable medium storing instructions that, when executed by one or more processors, cause the one or more processors to perform operations (col. 29, lines 17-42, col. 31, lines 25-36, claim 6).
       
           Claim 17 is rejected for the same reasons as discussed above with respect to claim 1.
Furthermore, Rao teaches a system comprising:
            one or more processors (fig.8; col.25, lines 27-32, col.28, lines 21-51); and 
           one or more computer-readable storage media storing instructions which, when executed by the one or more processors, cause the one or more processors to perform operations (col. 29, lines 17-42, col. 31, lines 25-36, claim 6).

Claims 1, 2, 12, 13, 17, and 18 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Yu (US Pub. No. 2023/0247077).
            Regarding claim 1, with respect to Figures 1-10, Yu teaches a method, comprising:
           joining a first client device to a first video conference, a first plurality of client devices connected to the first video conference (paragraph 0095); 
             receiving, from the first client device, a first audio stream (fig. 10, step 1010; paragraph 0095; "server may obtain an audio output from a participant device"); 
            determining, using a machine learning ("ML") model, at least one first audio quality measurement based on the first audio stream (fig. 10, step 1020; paragraph 0096: "server may detect that a quality associated with the audio output of the participant device has decreased during the video conference" and "the server may use a machine learning model to detect that the quality has decreased"); 
            computing a first metric for the first audio stream based on the at least one first audio quality measurement (paragraph 0096: "comparing the measurement to a threshold" necessarily implies representation of ML model output as a numerical metric); and 
            in response to the first metric satisfying a predetermined threshold, outputting a message including first information about a first low-quality audio status associated with the first audio stream (fig. 10, steps 1040, 1050; paragraphs 0062, 0097, 0098: "server may provide an alert to the participant indicating that the quality associated with the audio output has decreased") (Note; In paragraph 0062, Yu teaches comparing the measurement to a minimum acceptable performance level used as a threshold. If the measurement is below the threshold, the client application 420A may detect that the quality associated with the audio output is insufficient and/or has decreased such that a change to the communications system 430A should be made.) 
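The claim 1 steps mapped above (receive an audio stream, obtain a model-based quality measurement, compute a metric, compare it with a threshold, output a message) can be sketched as follows. This is a minimal illustration under stated assumptions, not the applicant's or Yu's implementation: `clipping_fraction` is a toy stand-in for the ML model, and `check_stream`, `Message`, the device identifier, and the threshold value are all hypothetical names and values. The sketch scores "badness", so the threshold is satisfied when the metric is at or above it; under the orientation described in Yu's paragraph 0062 (a measurement below a minimum acceptable level) the comparison would be reversed.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence

# Hypothetical stand-in for the claimed ML model: any callable that maps
# audio samples to a quality measurement in [0, 1] (1.0 = worst quality).
QualityModel = Callable[[Sequence[float]], float]


@dataclass
class Message:
    device_id: str
    status: str
    metric: float


def clipping_fraction(samples: Sequence[float]) -> float:
    """Toy 'model' for illustration only: fraction of samples at full scale."""
    if not samples:
        return 0.0
    return sum(1 for s in samples if abs(s) >= 1.0) / len(samples)


def check_stream(device_id: str, samples: Sequence[float],
                 model: QualityModel, threshold: float) -> Optional[Message]:
    measurement = model(samples)   # "at least one first audio quality measurement"
    metric = measurement           # "first metric" (identity mapping, an assumption)
    if metric >= threshold:        # "satisfying a predetermined threshold"
        return Message(device_id, "low-quality audio", metric)
    return None                    # no message when the threshold is not satisfied


clean = [0.1, -0.2, 0.3, -0.1]
clipped = [1.0, -1.0, 1.0, 0.2]
print(check_stream("client-1", clean, clipping_fraction, threshold=0.5))
print(check_stream("client-1", clipped, clipping_fraction, threshold=0.5))
```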

            Regarding claims 2, 13 and 18, Yu teaches in response to the first metric exceeding the predetermined threshold, outputting a command to cause a corrective action to remedy the first low-quality audio status associated with the first audio stream (paragraphs 0062, 0066, 0097, 0098).

           Claim 12 is rejected for the same reasons as discussed above with respect to claim 1.
Furthermore, Yu teaches a non-transitory computer-readable medium storing instructions that, when executed by one or more processors, cause the one or more processors to perform operations (paragraphs 0105, 0106, claim 18).
       
           Claim 17 is rejected for the same reasons as discussed above with respect to claim 1.
Furthermore, Yu teaches a system comprising:
            one or more processors (paragraphs 0037, 0038, 0105, claim 18); and 
            one or more computer-readable storage media storing instructions which, when executed by the one or more processors, cause the one or more processors to perform operations (paragraphs 0105, 0106, claim 18).



Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office action.
The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

Claims 8-10, 15, 16 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Rao et al. (US Pub. No. US12,047,536B1) in view of Gu et al. (Chinese Pub. No. CN104424208B).

            Regarding claim 8, Rao teaches wherein the artificial intelligence/machine learning algorithm [i.e., ML model] comprises a low-quality audio detection model trained to determine a low-quality audio (col. 6, lines 55-65) (Note: since the determination of the quality of the input signal 108 can also be based on the detection of unsavory sounds (see col. 6, lines 61-62), in order to determine the low-quality audio, the artificial intelligence/machine learning algorithm must have a low-quality audio detection model.). However, Rao does not specifically teach outputting a low-quality audio probability measurement. Gu teaches outputting a low-quality audio probability measurement (10th paragraph of page 3 and 2nd paragraph of page 4). Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Rao to incorporate the feature of outputting a low-quality audio probability measurement in Rao’s invention as taught by Gu. The motivation for the modification is to improve accuracy of the message screening and simplify operation. 

          Regarding claim 9, Rao teaches wherein the ML model is trained using training data comprising a plurality of historical input signals and data indicating acceptable or unacceptable quality [i.e., low-quality audio samples and a plurality of high-quality audio samples] (col.6, lines 55-65, col.14, lines 56-58).

         Regarding claims 10 and 16, Rao teaches wherein the ML model comprises a multiple participants [i.e., speakers] detection model trained to output a measurement that the composite input signal [i.e., first audio stream] includes a plurality of input signals [i.e., multiple voices] (fig. 4, step 418; col. 1, line 64 - col. 2, line 10; col. 6, lines 55-65; col. 22, lines 13-25) (Note: since the determination of the quality of the input signal 108 can also be based on the detection of unsavory sounds (see col. 6, lines 61-62), in order to determine the low-quality audio, the artificial intelligence/machine learning algorithm must have a low-quality audio detection model.). However, Rao does not specifically teach outputting a probability measurement. Gu teaches outputting a probability measurement (10th paragraph of page 3 and 2nd paragraph of page 4). Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Rao to incorporate the feature of outputting a probability measurement in Rao’s invention as taught by Gu. The motivation for the modification is to improve accuracy of the message screening. 

        Claim 15 is rejected for the same reasons as discussed above with respect to claims 8 and 9.

        Claim 20 is rejected for the same reasons as discussed above with respect to claims 8-10.

Claim 11 is rejected under 35 U.S.C. 103 as being unpatentable over Rao et al. (US Pub. No. US12,047,536B1) in view of KAKUMOTO et al. (Taiwanese Pub. No. TW 201801546 A).

            Regarding claim 11, Rao teaches wherein computing the first metric for the first audio stream based on the at least one first audio quality measurement comprises: receiving a plurality of audio quality measurements including the at least one first audio quality measurement (fig. 3A, step 308; col. 4, lines 13-15 and 51-61; col. 11, line 47 - col. 12, line 38; col. 15, lines 11-14). However, Rao does not specifically teach determining, using a smoothing module, a smoothing characteristic [i.e., smoothed audio quality] measurement and determining the first metric based on the smoothed audio quality measurement. KAKUMOTO teaches determining, using a smoothing module, a smoothed audio quality measurement and determining the strength characteristics/geometric mean of the intensities [i.e., first metric] based on the smoothed audio quality measurement (6th paragraph and 8th paragraph of page 3 and 2nd paragraph of page 12; 2nd portion of claim). Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Rao to incorporate the feature of determining, using a smoothing module, a smoothed audio quality measurement and determining the first metric based on the smoothed audio quality measurement in Rao’s invention as taught by KAKUMOTO. The motivation for the modification is to provide more pleasant sound. 
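The smoothing step recited in claim 11 can be pictured with a short sketch. This is an illustration only: an exponentially weighted moving average is swapped in as a generic smoothing technique and is not asserted to be the applicant's smoothing module or the technique in the cited reference, which the action characterizes in terms of a geometric mean of intensities. The function name `exponential_smooth`, the factor `alpha`, and the sample scores are hypothetical.

```python
from typing import Iterable, List


def exponential_smooth(measurements: Iterable[float], alpha: float = 0.3) -> List[float]:
    """Exponentially weighted moving average; alpha is a hypothetical smoothing factor."""
    smoothed: List[float] = []
    for x in measurements:
        if not smoothed:
            smoothed.append(x)   # seed with the first measurement
        else:
            smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed


# Hypothetical per-frame low-quality scores with one isolated spike.
raw = [0.1, 0.1, 0.9, 0.1, 0.1]
smoothed = exponential_smooth(raw)
first_metric = smoothed[-1]      # metric taken from the latest smoothed value
print([round(v, 3) for v in smoothed], round(first_metric, 3))
```

Smoothing before thresholding damps isolated spikes, so a single bad frame is less likely to trigger a low-quality message than a sustained degradation.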

Allowable Subject Matter
Claims 4-6, 14 and 19 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.


Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MD S ELAHEE whose telephone number is (571)272-7536.  The examiner can normally be reached on Monday thru Friday; 8:30AM to 5:00PM EST.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, FAN TSANG can be reached on 571-272-7547.  The fax phone number for the organization where this application or proceeding is assigned is (571) 273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).





/MD S ELAHEE/
MD SHAFIUL ALAM ELAHEE 
Primary Examiner, 
Art Unit 2694
January 26, 2026


Prosecution Timeline

Jul 25, 2024

Application Filed

Jan 23, 2026

Non-Final Rejection — §102, §103, §112

Apr 14, 2026

Examiner Interview Summary

Apr 14, 2026

Applicant Interview (Telephonic)

Precedent Cases

Applications granted by this same examiner with similar technology

18/611,079

Patent 12604133

SYSTEM AND METHOD OF ASSEMBLING AN ADJUSTABLE CLAMPING EAR CUP ASSEMBLY FOR AN AUDIO HEADSET

2y 5m to grant Granted Apr 14, 2026

18/070,128

Patent 12596891

CROSS-LINGUAL NATURAL LANGUAGE UNDERSTANDING MODEL FOR MULTI-LANGUAGE NATURAL LANGUAGE UNDERSTANDING (mNLU)

2y 5m to grant Granted Apr 07, 2026

18/099,577

Patent 12598260

HYBRID DIGITAL SIGNAL PROCESSING-ARTIFICIAL INTELLIGENCE ACOUSTIC ECHO CANCELLATION FOR VIRTUAL CONFERENCES

2y 5m to grant Granted Apr 07, 2026

18/754,631

Patent 12597412

Contextual Digital Assistant for Presentation Assistance

2y 5m to grant Granted Apr 07, 2026

18/540,283

Patent 12585889

NATURAL LANGUAGE GENERATION

2y 5m to grant Granted Mar 24, 2026

Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 1-2

Grant Probability: 79%

With Interview (+27.8%): 99%

Median Time to Grant: 3y 3m

PTA Risk: Low

Based on 827 resolved cases by this examiner. Grant probability derived from career allow rate.
