Prosecution Insights
Last updated: April 17, 2026
Application No. 18/716,722

DIGITAL AUDIO MEASUREMENT SYSTEMS AND METHOD

Non-Final OA: §101 §102 §103
Filed: Jun 05, 2024
Examiner: PATEL, SHREYANS A
Art Unit: 2659
Tech Center: 2600 (Communications)
Assignee: unknown
OA Round: 1 (Non-Final)
Grant Probability: 89% (Favorable)
OA Rounds: 1-2
To Grant: 2y 3m
With Interview: 96%

Examiner Intelligence

Grants 89% (above average)
Career Allow Rate: 89% (359 granted / 403 resolved; +27.1% vs TC avg)
Interview Lift: +7.4% (moderate lift; resolved cases with interview)
Typical timeline: 2y 3m average prosecution; 46 currently pending
Career history: 449 total applications across all art units

Statute-Specific Performance

§101: 21.3% (-18.7% vs TC avg)
§103: 36.0% (-4.0% vs TC avg)
§102: 22.6% (-17.4% vs TC avg)
§112: 8.8% (-31.2% vs TC avg)
Based on career data from 403 resolved cases

Office Action

§101 §102 §103
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 101

35 U.S.C. 101 reads as follows: Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-57 are rejected under 35 U.S.C. 101. The independent claims (claims 1, 49, and 54) recite segmenting a digital audio file into time blocks/windows and performing loudness/statistical calculations (LUFS values, differences vs. a reference/baseline, averages/means, standard deviation/distribution-type analyses) to generate new/updated indicator values (impact, LIV/LOCL/WCL-type metrics). These mathematical calculations fall within the “mathematical concepts” grouping of abstract ideas. Dependent claims 1-48 and 50-53 then add actions that use, organize, or present the calculated results, e.g., processing … based on the indicator, rendering/playback, sequencing/grouping and playlist compilation, associating mood tags/qualifiers, associating with platform performance, integrating with an external reference, and visualizing/outputting the indicator. These additions resemble post-solution activity (use/display/classify the computed number) or information management rather than a claim-recited technological improvement in audio processing or computer operation.
Furthermore, the limitations that the dependent claims add (e.g., 50 ms/400 ms blocks, 3 s/10 s windows, 75% overlap, ungated/gated LUFS, baseline averaging, separate above/below baseline averages, frequency band filtering, real time generation, and generic signal processing like gain normalization, EQ/spectral dynamics, multiband compression/expansion) recite high-level functional goals/results rather than a non-conventional technical implementation. When claims do not recite the concrete “how” (a specific architecture or specific way of achieving the stated result), the courts find no inventive concept. The claims as a whole are directed to (i) mathematical loudness/data analysis and (ii) organization/presentation of the resulting indicator on generic computing/audio components. The claim(s) does/do not include additional elements that are sufficient to amount to significantly more than the judicial exception because the claims are (i) mere instructions to implement the idea on a computer, and/or (ii) recitation of generic computer structure that serves to perform generic computer functions that are well-understood, routine, and conventional activities previously known to the pertinent industry. Viewed as a whole, these additional claim element(s) do not provide meaningful limitation(s) to transform the abstract idea into a patent eligible application of the abstract idea such that the claim(s) amounts to significantly more than the abstract idea itself. Therefore, the claim(s) are rejected under 35 U.S.C. 101 as being directed to non-statutory subject matter. There is further no improvement to the computing device.

Claim Rejections - 35 USC § 102

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C.
102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA ) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status. The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action: A person shall be entitled to a patent unless – (a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention. Claims 1, 7-9, 18-24, 29, 44, 46, 48-49 and 52-53 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Tech 3342 (“Loudness Range: A descriptor to supplement loudness normalization in accordance with EBU R 128”; Geneva 2021; pgs. 1-9). Claim 1, Tech teaches dividing, by a processor, a digital audio file into a plurality of blocks in a time sequence at a first time length ([Section 3.1] [MATLAB] time sequenced analysis windows (blocks); sliding analysis-window of length 3 seconds and a min block overlap of 68% between consecutive analysis windows is required; also in MATLAB: input ShortTermLoudness is a vector of loudness level); determining, by the processor, a respective Loudness Units relative to Full Scale (LUFS) for each block ([Section 3.1] [MATLAB] the input is a vector of loudness levels using a sliding analysis window and uses LUFS explicitly; the absolute threshold is set to -70 LUFS); determining, by the processor, a difference between the LUFS of each block with a reference value ([Section 3.1] [MATLAB] reference values (thresholds) and relative difference: the relative threshold is set to a level of -20 LU relative to the absolute-gated loudness level); and generating, by the processor, an indicator from one or more of the differences of the plurality of blocks 
within a second time length ([Sections 3 & 3.1] [MATLAB] generates a LRA (loudness range indicator) from block loudness values over a longer segment; it is computed based on the statistical distribution of loudness so that a short event would not affect a longer segment; the indicator as a difference between percentiles). Claim 7, Tech further teaches the method of claim 1, wherein the second time length is 3 seconds ([3.1] 3 seconds). Claim 8, Tech further teaches the method of claim 1, wherein the second time length is 10 seconds ([3.1] 3 seconds; this a design parameter which can be changed by the developer/user). Claim 9, Tech further teaches the method of claim 1, wherein the second time length is an entire duration of the digital audio file ([3.1] leading or trailing silence to isolate utterances). Claim 18, Tech further teaches the method of claim 1, wherein the first and second average values represent a local organized clusters (LOCL) value of the digital audio file ([Introduction] average loudness). Claim 19, Tech further teaches the method of claim 1, wherein the indicator is a mean value of the one or more of the differences ([5.] calculate mean). Claim 20, Tech further teaches the method of claim 1, wherein the indicator indicates a relative level of perceived impact from the digital audio file ([5.] relative threshold). Claim 21, Tech further teaches the method of claim 1, wherein the indicator indicates a relative level of perceived real impact value from the digital audio file ([5.] relative threshold). Claim 22, Tech further teaches the method of claim 1, wherein the indicator indicates a relative level of perceived textural impact value from the digital audio file ([5.] relative threshold). Claim 23, Tech further teaches the method of claim 1, wherein the indicator indicates a relative level of local organized clusters (LOCL) of the digital audio file ([5.] relative threshold). 
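The Tech 3342 passages cited against claim 1 describe a gated, percentile-based loudness range. A minimal Python sketch of that kind of computation follows, assuming a precomputed list of short-term loudness values in LUFS. The function name `loudness_range` and the round-half-up percentile indexing are illustrative assumptions, not text taken from the reference or from the Office action.

```python
import math

ABS_THRESHOLD_LUFS = -70.0  # absolute gate cited in the action
REL_THRESHOLD_LU = -20.0    # relative gate, measured from the absolute-gated level
LOW_PERCENTILE = 10
HIGH_PERCENTILE = 95


def loudness_range(short_term_lufs):
    """Return a loudness range in LU from short-term loudness values in LUFS."""
    # Step 1: absolute gating drops blocks quieter than -70 LUFS.
    abs_gated = [v for v in short_term_lufs if v >= ABS_THRESHOLD_LUFS]
    if not abs_gated:
        return 0.0

    # Step 2: average in the power domain to get the absolute-gated level.
    mean_power = sum(10 ** (v / 10) for v in abs_gated) / len(abs_gated)
    gated_level = 10 * math.log10(mean_power)

    # Step 3: relative gating keeps blocks within 20 LU below that level.
    rel_gated = sorted(v for v in abs_gated if v >= gated_level + REL_THRESHOLD_LU)

    def percentile(p):
        # Assumed indexing: nearest rank with round-half-up.
        index = int((len(rel_gated) - 1) * p / 100 + 0.5)
        return rel_gated[index]

    # Step 4: the indicator is the spread between the two percentiles.
    return percentile(HIGH_PERCENTILE) - percentile(LOW_PERCENTILE)
```

Because the result is a difference between percentiles of the gated distribution, a single short loud or quiet event has little effect on it, which is the property the examiner relies on.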
Claim 24, Tech further teaches the method of claim 23, wherein the indicator comprises one or more local organized clusters (LOCL) analysis comprising one or more of, within the second time length: a Standard Deviation of LUFS, a measure of average LUFS, a comparative analysis of LUFS, or a distribution analysis of LUFS ([3.1] sliding analysis-window).

Claim 29, Tech further teaches the method of claim 1, wherein the indicator is generated in real time ([2.] time-varying).

Claim 44, Tech further teaches the method of claim 1, further comprising analyzing, by the processor, the plurality of digital audio files using machine learning ([3.1] algorithm to process the audio).

Claim 46, Tech further teaches the method of claim 1, further comprising associating the indicator of the digital audio file with one or more qualifiers ([Table 1] narrow NLR programme segment and WLR programme segment).

Claim 48, Tech further teaches the method of claim 1, wherein each block has a size of an entire block, a half, quarter, 8th, 16th, 32nd, and 64th notes ([3.1] block overlap; this is a design parameter which can be changed by the developer/user).

Claim 49, A method comprising: dividing, by a processor, a digital audio file into a plurality of windows in a time sequence at a first time length; dividing, by the processor, each of the plurality windows into a plurality of blocks in a time sequence at a second time length; determining, by the processor, a respective Loudness Units relative to Full Scale (LUFS) for each of the plurality blocks; determining, by the processor, a difference between the LUFS of each of the plurality blocks with a reference value; and generating, by the processor, a Linear Impact (LIV) value for each of the plurality of the windows.
(Claim 49 contains subject matter similar to claim 1, and thus is rejected under similar rationale) Claim 52, Tech further teaches the method of claim 49, wherein the reference value is a LUFS of a next block of the each of the plurality blocks ([Fig. 1] threshold; three different blocks). Claim 53, Tech further teaches the method of claim 49, further comprising generating one or more indicators within a third time length based on the LIV value of the each of the plurality of windows ([3.1] sliding windows). Claim 54 is rejected under 35 U.S.C. 102(a)(1) as being anticipated by Tracey (US 2013/0272543). Claim 54, Tracey teaches a method comprising: dividing, by a processor, a digital audio file into a plurality of windows in a time sequence at a first time length ([Abstract] [0039] dividing the audio signal into a plurality of frames; the length of the frame is 12 secs.. different frames lengths.. 20 secs, 30 secs); dividing, by the processor, each of the plurality windows into a plurality of blocks in a time sequence at a second time length and each of the plurality blocks having a 75% overlap with a previous block ([0045] partition each frame into overlapping windows or blocks.. the overlay is 75%; duration windows are 400 msec); determining, by the processor, a respective Loudness Units relative to Full Scale (LUFS) for each of the plurality blocks ([0044] [0048] measures instantaneous loudness L_Wk for each window in a frame); determining, by the processor, an average LUFS value of the plurality blocks within the each of the plurality of window ([0049-0050] determines mean loudness of the frame and where M is the number of overlapping windows in the frame); and generating, by the processor, a local organized clusters (LOCL) for each of the plurality of windows within the first time length based on the average LUFS value of the plurality of blocks for each of the plurality of the windows ([0065] determines gated measured loudness 365 of the frame.. 
computes a weighted average of loudness values for windows). Claim Rejections - 35 USC § 103 The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action: A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made. Claims 2-4, 5-6, 10, 12, 17, 25, 27-28, 30, 35-40, 42-43, 45, 47, 50-51 and 55-57 are rejected under 35 U.S.C. 103 as being unpatentable over Tech 3342 (“Loudness Range: A descriptor to supplement loudness normalization in accordance with EBU R 128”; Geneva 2021; pgs. 1-9) and further in view of Baumgarte et al. (US 2014/04294200). Claim 2, Tech teaches all the limitations in claim 1. Tech further teaches a loudness descriptor for normalization (pg. 5). The difference between the prior art and the claimed invention is that Tech does not explicitly teach processing, by the processor, the digital audio file based on the indicator among a plurality of audio files. Baumgarte teaches processing, by the processor, the digital audio file based on the indicator among a plurality of audio files ([0003-0005] [0010-0011] [0060-0064] multi-file processing based on loudness descriptors; SoundCheck adjust playback volume across songs (music and movie files); metadata of an audio file is used to adjust loudness automatically and info can include loudness range; dynamic range descriptor can drive playback processing decisions). 
Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the teachings of Tech with teachings of Baumgarte by modifying the loudness range, a descriptor to supplement loudness normalization as taught by Tech to include processing, by the processor, the digital audio file based on the indicator among a plurality of audio files as taught by Baumgarte for the benefit of improving the quality of playback of the decoded signal in a consumer electronics end user device (Baumgarte [0002]).

Claim 3, Baumgarte further teaches the method of claim 2, wherein processing the digital audio file comprises rendering the audio file for playback among the plurality of the audio files ([0003] [0011] encoded audio is decoded before being presented to the consumer (playback) and metadata can be read by a media player during decoding and used in playback processing).

Claim 4, Tech further teaches the method of claim 2, wherein processing the digital audio file comprises sequencing or grouping of the audio file among the plurality of digital audio files ([Table 1] narrow NLR programme segment vs WLR programme segment).

Claims 5-6. Claim 5: Tech does not explicitly teach the method of claim 1, wherein the first time length is 400 ms. Baumgarte teaches wherein the first time length is 400 ms ([0060] 400 ms block). Claim 6: Tech does not explicitly teach the method of claim 1, wherein the first time length is 50 ms. Baumgarte teaches wherein the first time length is 50 ms ([0046] [0060] 400 ms block; one frame of digital audio can be between 5 and 100 ms; this is a design parameter which can be changed by the developer/user).
Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the teachings of Tech with teachings of Baumgarte by modifying the loudness range, a descriptor to supplement loudness normalization as taught by Tech to include wherein the first time length is 400/50 ms as taught by Baumgarte for the benefit of improving the quality of playback of the decoded signal in a consumer electronics end user device (Baumgarte [0002]). Claim 10, Tech further teaches the method of claim 5, wherein each block overlaps with a next block in the time sequence by 75% of the first time length ([3.1] block overlap of 66%; this a design parameter which can be changed by the developer/user). Claim 12, Tech further teaches the method of claim 5, wherein values of the LUFS are ungated, and the reference value comprises a LUFS of a subsequent block of each block ([Fig. 1] threshold; abs-gated). Claim 17, Tech further teaches the method of claim 6, wherein the LUFS are ungated, and the reference value comprises a LUFS of a subsequent block of each block ([Fig. 1] [3.1] calculate the Integrated Loudness and apply the gating relative threshold (-20LU relative to the absolute-gated loudness level) and (the gating is applied to remove low-level signals)). Claim 25, Tech does not explicitly teach the method of claim 1, wherein processing the digital audio file comprises re-mastering the digital audio file with less compression and saturation to allow for more momentary dynamic variance. Baumgarte teaches wherein processing the digital audio file comprises re-mastering the digital audio file with less compression and saturation to allow for more momentary dynamic variance ([0030] DRC can reduce the volume of loud sounds or amplify quiet sounds, by narrowing or "compressing" an audio signal's dynamic range; compression is commonly used in sound recording and reproduction and broadcasting). 
Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the teachings of Tech with teachings of Baumgarte by modifying the loudness range, a descriptor to supplement loudness normalization as taught by Tech to include wherein processing the digital audio file comprises re-mastering the digital audio file with less compression and saturation to allow for more momentary dynamic variance as taught by Baumgarte for the benefit of improving the quality of playback of the decoded signal in a consumer electronics end user device (Baumgarte [0002]). Claim 27, Tech does not explicitly teach the method of claim 1, wherein the digital audio file contains PCM audio data. Baumgarte teaches wherein the digital audio file contains PCM audio data ([0026] pulse code modulated). Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the teachings of Tech with teachings of Baumgarte by modifying the loudness range, a descriptor to supplement loudness normalization as taught by Tech to include wherein the digital audio file contains PCM audio data as taught by Baumgarte for the benefit of improving the quality of playback of the decoded signal in a consumer electronics end user device (Baumgarte [0002]). Claim 28, Tech further teaches the method of claim 2, further comprising outputting or visualizing the indicator ([Fig. 1] visual chart). Claim 30, Tech does not explicitly teach the method of claim 1, wherein the digital audio file is associated with a user's preference comprising a user-defined mood tag. Baumgarte teaches wherein the digital audio file is associated with a user's preference comprising a user-defined mood tag ([0067] user preference; and dynamic range of the content). 
Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the teachings of Tech with teachings of Baumgarte by modifying the loudness range, a descriptor to supplement loudness normalization as taught by Tech to include wherein the digital audio file is associated with a user's preference comprising a user-defined mood tag as taught by Baumgarte for the benefit of improving the quality of playback of the decoded signal in a consumer electronics end user device (Baumgarte [0002]). Claim 35, Baumgarte further teaches the method of claim 2, wherein processing the digital audio file comprises normalizing the digital audio file using attenuation or amplification so that the indicator is within a target range ([0040] power amplifier with dynamic range). Claim 36, Baumgarte further teaches the method of claim 2, wherein processing the digital audio file comprises spectral equalization or spectral dynamics for increasing or decreasing perceived loudness, envelope filtering, transient suppression, transient enhancement, frequency-based dynamics processing comprising high-frequency limiting or high-frequency expansion, or multiband compression or expansion, so that the digital audio file achieves a target indicator ([0055] three different DRC characteristics where each is associated with a different user volume setting or level; as can be seen in Fig. 6, as the volume increases, the amount of compression defined by the DRC characteristic increases when short-term loudness is increasing). Claim 37, Baumgarte further teaches the method of claim 36, wherein spectral equalization or spectral dynamics comprises one or more of: resonance suppression, high-frequency limiting, or low-frequency compression or expansion ([0004] using dynamic range control (DRC), which compresses the highs and lows of the audio signal so that the resulting audio signal can fit within a narrower envelope). 
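Claim 35 above recites normalizing a file by attenuation or amplification so that the indicator lands within a target range. As a rough illustration only, the sketch below assumes the indicator is a loudness level in LUFS that shifts one-for-one with a gain expressed in dB (an assumption that ignores gating effects). The helper names `normalization_gain` and `apply_gain` are hypothetical and are not drawn from Baumgarte or from the application.

```python
def normalization_gain(measured_lufs, target_lufs):
    """Return (gain in dB, linear amplitude factor) to move measured to target."""
    # Positive gain amplifies, negative gain attenuates.
    gain_db = target_lufs - measured_lufs
    # Amplitude ratio for a level change in dB uses a factor of 20.
    linear_gain = 10 ** (gain_db / 20)
    return gain_db, linear_gain


def apply_gain(samples, linear_gain):
    """Scale PCM samples (floats in [-1, 1]) by a linear gain factor."""
    return [s * linear_gain for s in samples]
```

For example, a file measured at -20 LUFS with a target of -14 LUFS would need 6 dB of gain, roughly a doubling of sample amplitude.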
Claim 38, Tech does not explicitly teach the method of claim 1, further comprising playing back the digital audio file at a user-defined level on a digital streaming platform. Baumgarte teaches playing back the digital audio file at a user-defined level on a digital streaming platform ([0011] metadata associated with an audio file or stream and read by a media player; user input such as volume setting). Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the teachings of Tech with teachings of Baumgarte by modifying the loudness range, a descriptor to supplement loudness normalization as taught by Tech to include playing back the digital audio file at a user-defined level on a digital streaming platform as taught by Baumgarte for the benefit of improving the quality of playback of the decoded signal in a consumer electronics end user device (Baumgarte [0002]).

Claim 39, Tech does not explicitly teach the method of claim 1, further comprising processing digital audio files with one or more user defined parameters comprising bass intensity, high-frequency density, sibilance, perceived loudness, perceived impact, perceived textural impact, macrodynamic profile, tempo or beats-per-minute, genre or subgenre, lyrical content, mood defined by the user or defined by a combination of measured characteristics, key, spectral characteristics comprising bass intensity, midrange intensity, high-frequency density, sibilance, or dynamics characteristics.
Baumgarte teaches processing digital audio files with one or more user defined parameters comprising bass intensity, high-frequency density, sibilance, perceived loudness, perceived impact, perceived textural impact, macrodynamic profile, tempo or beats-per-minute, genre or subgenre, lyrical content, mood defined by the user or defined by a combination of measured characteristics, key, spectral characteristics comprising bass intensity, midrange intensity, high-frequency density, sibilance, or dynamics characteristics ([0011] [0063] local parameters including user input such as volume setting; a target dynamic range can be set depending on user input volume setting). Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the teachings of Tech with teachings of Baumgarte by modifying the loudness range, a descriptor to supplement loudness normalization as taught by Tech to include processing digital audio files with one or more user defined parameters comprising bass intensity, high-frequency density, sibilance, perceived loudness, perceived impact, perceived textural impact, macrodynamic profile, tempo or beats-per-minute, genre or subgenre, lyrical content, mood defined by the user or defined by a combination of measured characteristics, key, spectral characteristics comprising bass intensity, midrange intensity, high-frequency density, sibilance, or dynamics characteristics as taught by Baumgarte for the benefit of improving the quality of playback of the decoded signal in a consumer electronics end user device (Baumgarte [0002]).
Claim 40, Baumgarte further teaches the method of claim 2, wherein processing the digital audio file comprises sequencing or grouping of the digital audio file among the plurality of digital audio files based on real impact values, textural impact value, or quiet or loud markers of local organized clusters of the plurality of the digital audio files ([Claim 16] wherein the received metadata further includes a plurality of values selected from the group consisting of: program loudness, true peak, loudness range, maximum momentary loudness, and short-term loudness values).

Claim 42, Tech does not explicitly teach the method of claim 2, wherein processing the digital audio file is based on a user-defined profile. Baumgarte teaches wherein processing the digital audio file is based on a user-defined profile ([0055] DRC characteristic with a different user volume setting or level). Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the teachings of Tech with teachings of Baumgarte by modifying the loudness range, a descriptor to supplement loudness normalization as taught by Tech to include wherein processing the digital audio file is based on a user-defined profile as taught by Baumgarte for the benefit of improving the quality of playback of the decoded signal in a consumer electronics end user device (Baumgarte [0002]).

Claim 43, Tech does not explicitly teach the method of claim 1, further comprising associating, by the processor, the plurality of digital audio files with their respective performance upon one or more digital consumption platforms.
Baumgarte teaches associating, by the processor, the plurality of digital audio files with their respective performance upon one or more digital consumption platforms ([0005] audio program can include a metadata portion that is associated with the encoded audio signal; metadata used by software in the user device so as to change the consumer’s experience during playback). Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the teachings of Tech with teachings of Baumgarte by modifying the loudness range, a descriptor to supplement loudness normalization as taught by Tech to include associating, by the processor, the plurality of digital audio files with their respective performance upon one or more digital consumption platforms as taught by Baumgarte for the benefit of improving the quality of playback of the decoded signal in a consumer electronics end user device (Baumgarte [0002]). Claim 45, Tech does not explicitly teach the method of claim 2, wherein processing the digital audio file comprises modifying the digital audio file so that the indicator is within a target range. Baumgarte teaches processing the digital audio file comprises modifying the digital audio file so that the indicator is within a target range ([0063] a target dynamic range can be set and select a DRC characteristic so that the range will be reduced to the target). 
Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the teachings of Tech with teachings of Baumgarte by modifying the loudness range, a descriptor to supplement loudness normalization as taught by Tech to include processing the digital audio file comprises modifying the digital audio file so that the indicator is within a target range as taught by Baumgarte for the benefit of improving the quality of playback of the decoded signal in a consumer electronics end user device (Baumgarte [0002]). Claim 47, Baumgarte further teaches the method of claim 36, wherein the target indicator is user specified or matches one or more indicators or the plurality audio files ([0067] user preference;). Claim 50, Tech further teaches the method of claim 36, wherein the first time length is 3 seconds ([3.1] 3 seconds). Claim 51, Tech does not explicitly teach the method of claim 49, wherein the first time length is 400 ms. Baumgarte teaches wherein the first time length is 400 ms ([0046] [0060] 400 ms block; one frame of digital audio can be between 5 and 100 ms; this a design parameter which can be changed by the developer/user). Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the teachings of Tech with teachings of Baumgarte by modifying the loudness range, a descriptor to supplement loudness normalization as taught by Tech to include wherein the first time length is 400 ms as taught by Baumgarte for the benefit of improving the quality of playback of the decoded signal in a consumer electronics end user device (Baumgarte [0002]). Claim 55, Tech further teaches the method of claim 36, wherein the first time length is 3 seconds ([3.1] 3 seconds). Claim 56, Baumgarte further teaches the method of claim 36, wherein the second time length is 400 ms ([0060] 400 msec block). 
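The timing figures cited in this action (3 s windows, 400 ms blocks, 75% overlap) imply a specific block layout, which can be worked out directly. The sketch below is illustrative arithmetic only. `block_starts` is a hypothetical helper, and it assumes that only blocks fitting entirely inside a window are counted.

```python
def block_starts(window_ms, block_ms, overlap_fraction):
    """Return start times (ms) of overlapping blocks that fit inside one window."""
    # The hop is the part of a block that does not overlap its predecessor.
    hop_ms = round(block_ms * (1 - overlap_fraction))
    if hop_ms <= 0:
        raise ValueError("overlap must be less than 100%")
    # The last block must end at or before the end of the window.
    return list(range(0, window_ms - block_ms + 1, hop_ms))


starts = block_starts(window_ms=3000, block_ms=400, overlap_fraction=0.75)
print(starts[1] - starts[0])  # hop between consecutive blocks, in ms
print(len(starts))            # number of whole blocks in one window
```

Under these assumptions a 75% overlap of 400 ms blocks gives a 100 ms hop, and a 3 s window holds 27 whole blocks.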
Claim 57, Tech further teaches the method of claim 36, further comprising generating a Windowed Clusters (WCL) value based on the LOCL value for each of the plurality of windows ([Introduction] average loudness). Claims 11 and 13-16 are rejected under 35 U.S.C. 103 as being unpatentable over Tech 3342 (“Loudness Range: A descriptor to supplement loudness normalization in accordance with EBU R 128”; Geneva 2021; pgs. 1-9) in view of Baumgarte et al. (US 2014/04294200) and further in view of Tracey (US 2013/0272543). Claim 11, Tech does not explicitly teach the method of claim 10, wherein the next block starts 100 ms from a start time of each block. Tracey teaches wherein the next block starts 100 ms from a start time of each block ([0048] a new loudness value can be computed every 200 msec). Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the teachings of Tech with teachings of Tracey by modifying the loudness range, a descriptor to supplement loudness normalization as taught by Tech to include wherein the next block starts 100 ms from a start time of each block as taught by Tracey for the benefit of selectively discarding at least some of the loudness values that reach an adaptive loudness threshold (Tracey [Abstract]). Claim 13, Tech further teaches the method of claim 10, wherein the reference value comprises a baseline value of the digital audio file which is an average LUFS in a third time length ([3.1] overall loudness for blocks as short-term loudness values in LUFS using windowed analysis). Claim 14, Tech further teaches the method of claim 13, wherein the third period is 3 seconds ([3.1] 3 seconds). Claim 15, Tech further teaches the method of claim 13, further comprising comparing the LUFS of each block to the baseline value and determining a difference between the LUFS for each block and the baseline value, wherein the LUFS is gated ([Fig. 
1] [3.1] calculate the Integrated Loudness and apply the gating relative threshold (-20LU relative to the absolute-gated loudness level) and (the gating is applied to remove low-level signals)).

Claim 16, Tech further teaches the method of claim 15, further comprising separately generating a first average of the difference between the baseline value and LUFS values greater than the baseline value, and a second average of the difference between the baseline value and the LUFS values lower than the baseline value ([3.1] defines LRA via difference between the estimates of the 10th and the 95th percentiles).

Claim 26 is rejected under 35 U.S.C. 103 as being unpatentable over Tech 3342 (“Loudness Range: A descriptor to supplement loudness normalization in accordance with EBU R 128”; Geneva 2021; pgs. 1-9) in view of Baumgarte et al. (US 2014/04294200) and further in view of Charoenruengkit et al. (US 7,822,498).

Claim 26, Tech and Baumgarte do not explicitly teach the method of claim 2, further comprising compiling a playlist comprising the plurality of audio files. Charoenruengkit teaches compiling a playlist comprising the plurality of audio files ([col. 5 lines 42-50] select audio files from two or more audio files). Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the teachings of Tech with teachings of Charoenruengkit by modifying the loudness range, a descriptor to supplement loudness normalization as taught by Tech to include compiling a playlist comprising the plurality of audio files as taught by Charoenruengkit for the benefit of training a speech recognition engine or speech processing system (Charoenruengkit [Abstract]).

Claims 31-34 are rejected under 35 U.S.C. 103 as being unpatentable over Tech 3342 (“Loudness Range: A descriptor to supplement loudness normalization in accordance with EBU R 128”; Geneva 2021; pgs. 1-9) and further in view of Chung et al.
(US 2022/0166397). Claim 31, Tech does not explicitly teach the method of claim 1, further comprising filtering, by the processor, the digital audio file in one or more frequency bands. Chung teaches filtering, by the processor, the digital audio file in one or more frequency bands ([0037] band pass filter). Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the teachings of Tech with teachings of Chung by modifying the loudness range, a descriptor to supplement loudness normalization as taught by Tech to include filtering, by the processor, the digital audio file in one or more frequency bands as taught by Chung for the benefit of controlling a frequency spectrum of an input audio signal (Chung [0037]). Claim 32, Chung further teaches the method of claim 31, wherein the one or more selected frequency bands comprise one or more of: below 100 Hz, between 100 Hz and 2000 Hz, between 2000 Hz and 5000 Hz, or a between 5000 Hz and 12000 Hz ([0072] 100 Hz or lower). Claim 33, Chung further teaches the method of claim 32, wherein the indicator comprises one or more of: a low- frequency indicator for the low-frequency band, a mid-frequency indicator for the mid-frequency 41band, a high-frequency indicator for the high-frequency band, or a sibilance indicator for the sibilance frequency band ([0016] [0037] low/high frequency bands; band pass filer; the characteristic of the user device indicates a frequency band which the user device is able to output, and the recognition ability of the user indicates sensitivity of the user relative to a frequency band). 
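For orientation, the band split recited in claims 31-33 can be pictured with a minimal sketch. It measures how a signal's energy divides across the four frequency ranges quoted from claim 32, using a naive DFT from the Python standard library. The mapping of those four ranges to the names "low", "mid", "high", and "sibilance" is an assumption drawn from the order of the terms in claim 33, and the function names `band_energies` and `tone` are hypothetical. Neither Chung nor the application is known to compute band content this way.

```python
import cmath
import math

# Band edges in Hz, taken from the claim 32 language quoted above.
# ASSUMPTION: the name attached to each range follows the order of the
# indicators listed in claim 33; the claims do not state this mapping.
BANDS = {
    "low": (0.0, 100.0),
    "mid": (100.0, 2000.0),
    "high": (2000.0, 5000.0),
    "sibilance": (5000.0, 12000.0),
}


def band_energies(samples, sample_rate):
    """Return each band's share of spectral energy (0.0 to 1.0).

    Uses a naive O(n^2) DFT, which is fine for short illustrative
    signals but far too slow for real audio files.
    """
    n = len(samples)
    energy = {name: 0.0 for name in BANDS}
    # DC (k = 0) is skipped; bins above Nyquist mirror the lower ones.
    for k in range(1, n // 2 + 1):
        freq = k * sample_rate / n
        coeff = sum(
            s * cmath.exp(-2j * math.pi * k * i / n)
            for i, s in enumerate(samples)
        )
        power = abs(coeff) ** 2
        for name, (lo, hi) in BANDS.items():
            if lo <= freq < hi:
                energy[name] += power
                break
    total = sum(energy.values())
    if total == 0.0:
        return energy
    return {name: value / total for name, value in energy.items()}


def tone(freq, sample_rate, n):
    """Generate n samples of a unit-amplitude sine wave (test helper)."""
    return [math.sin(2 * math.pi * freq * i / sample_rate) for i in range(n)]


if __name__ == "__main__":
    # A 1 kHz tone sampled at 24 kHz should land in the 100-2000 Hz band.
    print(band_energies(tone(1000.0, 24000, 480), 24000))
```

A per-band indicator of the kind recited in claim 33 could then be derived from each band's share, although how the application actually defines those indicators is not stated in the office action.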
Claim 34, Chung further teaches the method of claim 33, further comprising processing, by the processor, the digital audio file based on the one or more of the low-frequency indicator, the mid-frequency indicator, the high-frequency indicator, or the sibilance indicator ([0037] EQ for compensation for a user's sensitivity changing relative to a high band or low band frequency region according to a change in a reproduction level in a case where the reproduction level is changed due to adjustment of the output (volume) of an audio signal).

Claim 41 is rejected under 35 U.S.C. 103 as being unpatentable over Tech 3342 (“Loudness Range: A descriptor to supplement loudness normalization in accordance with EBU R 128”; Geneva 2021; pgs. 1-9) and further in view of Cremer et al. (US 2020/0162049).

Claim 41, Tech does not explicitly teach the method of claim 1, further comprising associating the digital audio file for integration with an external reference based on the indicator. Cremer teaches associating the digital audio file for integration with an external reference based on the indicator ([0064] audio analysis module can align the loudness level profile to the audio signal using fingerprinting; receive a reference fingerprint compare with query fingerprint). Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the teachings of Tech with the teachings of Cremer by modifying the loudness range, a descriptor to supplement loudness normalization as taught by Tech, to include associating the digital audio file for integration with an external reference based on the indicator as taught by Cremer, for the benefit of determining loudness level of the first media content and the reference loudness level (Cremer [Abstract]).

Conclusion

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.

Vaudrey et al. (CN 1213556): A method used to provide the multiple users adjusting ability of voice to remaining audio (VRA) includes a first decoder (14) receiving a voice signal and a remaining audio signal and simultaneously receiving the voice signal and the remaining audio signal at the second decoder (15), wherein the voice signal and the remaining audio signal are received separately, and separately adjusting the separately received voice signal and remaining audio signal by each decoder.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to SHREYANS A PATEL whose telephone number is (571)270-0689. The examiner can normally be reached Monday-Friday 8am-5pm PST.

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Pierre Desir, can be reached at 571-272-7799. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

SHREYANS A. PATEL
Primary Examiner
Art Unit 2653

/SHREYANS A PATEL/
Examiner, Art Unit 2659
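The loudness arithmetic at issue in the claim 15 and claim 16 rejections can be compared side by side in a short sketch. `loudness_range` follows the gating and percentile steps described in Tech 3342 [3.1]: an absolute gate at -70 LUFS, a relative gate 20 LU below the absolute-gated level, then the spread between the 10th and 95th percentiles. `split_averages` is a hypothetical reading of the claim 16 language (separate averages of the differences above and below a baseline). The function names, the nearest-index percentile rule, and the handling of empty inputs are assumptions made for illustration, not the applicant's or the examiner's implementation.

```python
import math

ABS_GATE_LUFS = -70.0  # absolute gate described in Tech 3342
REL_GATE_LU = -20.0    # relative gate, measured from the absolute-gated level
PCT_LOW = 10.0
PCT_HIGH = 95.0


def _energy_mean_lufs(values):
    """Average loudness values in the power domain and return LUFS."""
    power = sum(10.0 ** (v / 10.0) for v in values) / len(values)
    return 10.0 * math.log10(power)


def _pick(sorted_values, pct):
    """Percentile by rounding to the nearest index (ASSUMED rule)."""
    index = int((len(sorted_values) - 1) * pct / 100.0 + 0.5)
    return sorted_values[index]


def loudness_range(short_term_lufs):
    """Return LRA in LU from a list of short-term loudness values."""
    abs_gated = [v for v in short_term_lufs if v >= ABS_GATE_LUFS]
    if not abs_gated:
        return 0.0  # ASSUMPTION: nothing survives gating, report zero
    threshold = _energy_mean_lufs(abs_gated) + REL_GATE_LU
    rel_gated = sorted(v for v in abs_gated if v >= threshold)
    if not rel_gated:
        return 0.0
    return _pick(rel_gated, PCT_HIGH) - _pick(rel_gated, PCT_LOW)


def split_averages(block_lufs, baseline):
    """Hypothetical claim 16 reading: mean distance above and below baseline.

    Returns (first_average, second_average), where the first covers blocks
    louder than the baseline and the second covers quieter blocks.
    """
    above = [v - baseline for v in block_lufs if v > baseline]
    below = [baseline - v for v in block_lufs if v < baseline]
    first = sum(above) / len(above) if above else 0.0
    second = sum(below) / len(below) if below else 0.0
    return first, second


if __name__ == "__main__":
    blocks = [-80.0, -50.0, -23.0, -22.0, -21.0, -20.0, -19.0]
    print(loudness_range(blocks))
    print(split_averages([-20.0, -18.0, -25.0, -23.0], -22.0))
```

The two functions return different kinds of quantity: the first is a single percentile spread over gated values, while the second is a pair of means taken on either side of a baseline.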

Prosecution Timeline

Jun 05, 2024
Application Filed
Jan 27, 2026
Non-Final Rejection — §101, §102, §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12586597
ENHANCED AUDIO FILE GENERATOR
2y 5m to grant Granted Mar 24, 2026
Patent 12586561
TEXT-TO-SPEECH SYNTHESIS METHOD AND SYSTEM, A METHOD OF TRAINING A TEXT-TO-SPEECH SYNTHESIS SYSTEM, AND A METHOD OF CALCULATING AN EXPRESSIVITY SCORE
2y 5m to grant Granted Mar 24, 2026
Patent 12548549
ON-DEVICE PERSONALIZATION OF SPEECH SYNTHESIS FOR TRAINING OF SPEECH RECOGNITION MODEL(S)
2y 5m to grant Granted Feb 10, 2026
Patent 12548583
ACOUSTIC CONTROL APPARATUS, STORAGE MEDIUM AND ACCOUSTIC CONTROL METHOD
2y 5m to grant Granted Feb 10, 2026
Patent 12536988
SPEECH SYNTHESIS METHOD AND APPARATUS, DEVICE, AND STORAGE MEDIUM
2y 5m to grant Granted Jan 27, 2026
Based on 5 most recent grants.

Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: 89%
With Interview: 96% (+7.4%)
Median Time to Grant: 2y 3m
PTA Risk: Low
Based on 403 resolved cases by this examiner. Grant probability derived from career allow rate.
