Last updated: April 19, 2026

Application No. 18/787,432

DETERMINING SPEED CHANGE RATIO FOR AUDIO SAMPLES

Non-Final OA §101 §103

Filed: Jul 29, 2024

Examiner: WEAVER, ADAM MICHAEL

Art Unit: 2658

Tech Center: 2600 (Communications)

Assignee: Lemon Inc.

OA Round: 1 (Non-Final)

Interview Optional

+20.0% interview lift. This examiner has a relatively high allow rate; a written response may suffice.

Based on 12 resolved cases, 2023–2026

Examiner Intelligence

WEAVER, ADAM MICHAEL

Grants 92% — above average

Career Allow Rate

11 granted / 12 resolved

+29.7% vs TC avg

Strong +20% interview lift

Interview Lift: +20.0% (with vs. without interview)

Typical timeline

2y 9m

Avg Prosecution

27 currently pending

Career history

Total Applications

across all art units

Statute-Specific Performance

§101: 33.2% (-6.8% vs TC avg)

§103: 44.7% (+4.7% vs TC avg)

§102: 19.0% (-21.0% vs TC avg)

§112: 2.1% (-37.9% vs TC avg)

Based on career data from 12 resolved cases

Office Action

§101 §103

DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Information Disclosure Statement
The information disclosure statement (IDS) submitted on 09/09/2025 is being considered by the examiner.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. Independent claims 1, 11, and 20 recite “receive a first audio sample and a second audio sample”, “determine a speed change ratio between the first audio sample and the second audio sample”, “computing a similarity matrix”, “identifying a plurality of peak points”, “identifying one or more peak lines”, “and computing the speed change ratio”, and “output the speed change ratio”. These limitations, as drafted, recite a process that, under the broadest reasonable interpretation, covers the abstract idea of “mental processes” because they cover concepts performed in the human mind, including observation, evaluation, judgment, and opinion. See MPEP 2106.04(a)(2). Nothing in the claimed elements precludes the steps from practically being performed by a person taking in audio samples, extracting features from them, computing a similarity between these features, finding peak points and peak lines, and computing a speed ratio between the two audio samples.
This judicial exception is not integrated into a practical application because the claim is directed to a method of audio comparison, and the data gathering and analysis steps required to perform the method do not add a meaningful limitation to the method as they are insignificant extra-solution activity. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. Thus, the claims as a whole are directed to an abstract idea (Step 2A, prong two).
Claims 1, 11, and 20 do not include any additional elements that are sufficient to amount to significantly more than the judicial exception because, as discussed above with respect to integration of the abstract idea into a practical application, the data gathering and analysis steps required to perform the method do not add a meaningful limitation to the method as they are insignificant extra-solution activity. Mere data gathering and analysis do not provide an inventive concept (Step 2B).
Dependent claims 2-10 and 12-19 are directed to describing the identification of the peak points and lines. These limitations are also related to the abstract idea of “mental processes.” That is, nothing in the claimed elements precludes the steps from practically being performed by a person taking in audio samples, extracting features from them, computing a similarity between these features, finding peak points and peak lines, and computing a speed ratio between the two audio samples.


Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 3, and 11 are rejected under 35 U.S.C. 103 as being unpatentable over Li et al. (US Patent Application Publication No. 2021/0165827), hereinafter referred to as Li, in view of Wang et al. (US Patent No. 7,627,477), hereinafter referred to as Wang.

Regarding claim 1, Li discloses a computing system comprising: one or more processing devices configured to: receive a first audio sample and a second audio sample ("The audio fingerprint comprises a first part configured for indicating a content feature of the query audio and a second part configured for indicating credibility of the first part," Li para [0006]);
determine the first audio sample and the second audio sample at least in part by: extracting a set of first audio features from the first audio sample and a set of second audio features from the second audio sample (Li Fig. 14 shows Multi-type first audio fingerprint obtainer, reference character 1300, and Multi-type second audio fingerprint obtainer, reference character 1400);
computing a similarity matrix including a plurality of similarity values between the set of first audio features and the set of second audio features (Li Fig. 5 reference character S43 shows determining a similarity matrix between the first candidate audio and the query audio);
identifying a plurality of peak points in the similarity matrix (Li Fig. 7 reference character S44-2a shows selecting extreme value points, i.e. peak points);
identifying one or more peak lines that each include two or more of the peak points (Li Fig. 7 reference character S44-2b shows fitting a line based on the extreme value points);
and computing based at least in part on one or more respective slopes of the one or more peak lines ("The straight line also includes straight lines of which the slope is not 1. For example, the straight line can be a straight line of which the slope is almost 1 to improve audio retrieval and recognition robustness; the straight line can be straight lines of which the slopes are 2, 3, . . . or ½, ⅓, . . . and the like to cope with the retrieval and recognition of the audio subjected to speed regulation; and the straight line even can be straight lines (which are from lower left to right upper in the similarity matrix) of which the slope is a negative number to cope with the retrieval and recognition of the audio subjected to backward playing processing," Li para [0138]).
However, Li fails to disclose “a speed change ratio between”; “the speed change ratio”; and “output the speed change ratio”. Wang discloses a method for invariant audio pattern matching.
Wang teaches a speed change ratio between ("Histogram 720 of FIG. 7B illustrates a peak of accumulated relative playback speed ratios indicating the global relative playback speed ratio R," Wang col. 7 lines 51-53); 
the speed change ratio ("Histogram 720 of FIG. 7B illustrates a peak of accumulated relative playback speed ratios indicating the global relative playback speed ratio R," Wang col. 7 lines 51-53); 
and output the speed change ratio ("Histogram 720 of FIG. 7B illustrates a peak of accumulated relative playback speed ratios indicating the global relative playback speed ratio R," Wang col. 7 lines 51-53).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Li’s method of audio identification, which compares two audio signals, by incorporating Wang’s method of using the playback speed ratio to compare audio signals. Playback speed ratio, like frequency and tempo, is a useful marker of an audio signal for comparison. Using a speed ratio, or any of the aforementioned values, allows two audio signals to be easily compared by reducing the comparison to a single value, i.e., a ratio.
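For orientation, the limitations mapped above (a similarity matrix, peak points, a line through them, and a slope read as a speed ratio) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method of the application, Li, or Wang: the use of cosine similarity, per-row maxima as peak points, a least-squares line fit, and the function names `similarity_matrix` and `slope_of_peak_line` are all hypothetical choices made for the sketch.

```python
import numpy as np


def similarity_matrix(feats_a, feats_b):
    """Cosine similarity between every frame of A (rows) and every frame of B (columns).

    Assumption: each input is a 2-D array of shape (frames, feature_dim)
    with no all-zero rows.
    """
    a = feats_a / np.linalg.norm(feats_a, axis=1, keepdims=True)
    b = feats_b / np.linalg.norm(feats_b, axis=1, keepdims=True)
    return a @ b.T


def slope_of_peak_line(sim):
    """Fit one straight line through the per-row maxima and return its slope.

    Assumption: the best match for each frame of A is the first column
    holding that row's maximum; the slope is in columns (B frames) per
    row (A frame).
    """
    rows = np.arange(sim.shape[0])
    cols = sim.argmax(axis=1)
    slope, _intercept = np.polyfit(rows, cols, 1)
    return slope


# Toy data: B repeats every frame of A twice, so B advances two frames per A frame.
toy_a = np.eye(8)
toy_b = np.repeat(toy_a, 2, axis=0)
toy_slope = slope_of_peak_line(similarity_matrix(toy_a, toy_b))
```

On the toy data the fitted slope comes out at approximately 2. Whether such a slope is reported as a speed-up or a slow-down depends on which sample is treated as the reference, which this sketch does not fix.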

Regarding claim 3, Li, in view of Wang, discloses all of the limitations of claim 1. Li further discloses wherein the one or more processing devices are configured to identify the plurality of peak points as the K highest similarity values included in the similarity matrix, where K is a predefined peak count ("Step S44-2a, a plurality of points with a highest single similarity are selected from the similarity matrix as similarity extreme value points; the specific amount of the similarity extreme value points can be preset," Li para [0152]).
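The "K highest similarity values" limitation discussed above can be illustrated with a short NumPy sketch. It assumes the similarity matrix is a dense 2-D array and that ties may be broken arbitrarily; the helper name `top_k_peaks` is hypothetical and is not taken from the application or from Li.

```python
import numpy as np


def top_k_peaks(sim, k):
    """Return the (row, col) positions of the k largest entries, highest first."""
    # Sort the flattened matrix in ascending order, then reverse and keep k indices.
    flat_order = np.argsort(sim, axis=None)[::-1][:k]
    rows, cols = np.unravel_index(flat_order, sim.shape)
    return list(zip(rows.tolist(), cols.tolist()))


example = np.array([[0.1, 0.9, 0.2],
                    [0.8, 0.3, 0.7]])
peaks = top_k_peaks(example, 3)  # positions of 0.9, 0.8, and 0.7
```

Here K plays the role of the predefined peak count; a full sort is used for clarity rather than efficiency.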

As to claim 11, method claim 11 and system claim 1 are related as system and method of using same, with each claimed element’s function corresponding to the system step. Accordingly, claim 11 is similarly rejected under the same rationale as applied above with respect to the system claim.

Claims 2, 12, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Li, in view of Wang, and further in view of Zhao et al. (CN114464214 A), hereinafter referred to as Zhao.

Regarding claim 2, Li, in view of Wang, discloses all of the limitations of claim 1. However, Li fails to disclose wherein the one or more processing devices are configured to extract the set of first audio features and the set of second audio features at a feature extraction neural network.
Zhao discloses a method for computing audio similarity. 
Zhao teaches wherein the one or more processing devices are configured to extract the set of first audio features and the set of second audio features at a feature extraction neural network ("The method comprises: inputting the first audio and the second audio to respectively detected to the neural network model, performing feature extraction based on the first audio and the second audio respectively," Zhao Abstract).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Li’s method of audio identification, which compares two audio signals, by incorporating Zhao’s method of utilizing a neural network to extract audio features. Using a neural network to extract audio features provides higher accuracy, automated learning, and better resilience to noise than manual or other mainstream methods.

As to claim 12, method claim 12 and system claim 2 are related as system and method of using same, with each claimed element’s function corresponding to the system step. Accordingly, claim 12 is similarly rejected under the same rationale as applied above with respect to the system claim.

Regarding claim 20, Li discloses a computing system comprising: one or more processing devices configured to: receive a first audio sample and a second audio sample ("The audio fingerprint comprises a first part configured for indicating a content feature of the query audio and a second part configured for indicating credibility of the first part," Li para [0006]);
determine the first audio sample and the second audio sample at least in part by: extracting a set of first audio features from the first audio sample and a set of second audio features from the second audio sample (Li Fig. 14 shows Multi-type first audio fingerprint obtainer, reference character 1300, and Multi-type second audio fingerprint obtainer, reference character 1400);
computing a similarity matrix including a plurality of similarity values between the set of first audio features and the set of second audio features (Li Fig. 5 reference character S43 shows determining a similarity matrix between the first candidate audio and the query audio);
identifying a plurality of peak points in the similarity matrix (Li Fig. 7 reference character S44-2a shows selecting extreme value points, i.e. peak points);
identifying one or more peak lines that each include two or more of the peak points (Li Fig. 7 reference character S44-2b shows fitting a line based on the extreme value points);
and computing as a mean of one or more respective slopes of the one or more peak lines ("The straight line also includes straight lines of which the slope is not 1. For example, the straight line can be a straight line of which the slope is almost 1 to improve audio retrieval and recognition robustness; the straight line can be straight lines of which the slopes are 2, 3, . . . or ½, ⅓, . . . and the like to cope with the retrieval and recognition of the audio subjected to speed regulation; and the straight line even can be straight lines (which are from lower left to right upper in the similarity matrix) of which the slope is a negative number to cope with the retrieval and recognition of the audio subjected to backward playing processing," Li para [0138] AND “Specifically, the straight-line similarity of one straight line can be set as the mean value of unit similarities contained in the straight line, or can be set as the sum value of unit similarities contained in the straight line,” Li para [0145]).
However, Li does not disclose “a speed change ratio between”; “at a feature extraction neural network”; “the speed change ratio”; and “output the speed change ratio”.
Wang teaches a speed change ratio between ("Histogram 720 of FIG. 7B illustrates a peak of accumulated relative playback speed ratios indicating the global relative playback speed ratio R," Wang col. 7 lines 51-53); 
the speed change ratio ("Histogram 720 of FIG. 7B illustrates a peak of accumulated relative playback speed ratios indicating the global relative playback speed ratio R," Wang col. 7 lines 51-53); 
and output the speed change ratio ("Histogram 720 of FIG. 7B illustrates a peak of accumulated relative playback speed ratios indicating the global relative playback speed ratio R," Wang col. 7 lines 51-53).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Li’s method of audio identification, which compares two audio signals, by incorporating Wang’s method of using the playback speed ratio to compare audio signals. Playback speed ratio, like frequency and tempo, is a useful marker of an audio signal for comparison. Using a speed ratio, or any of the aforementioned values, allows two audio signals to be easily compared by reducing the comparison to a single value, i.e., a ratio.
Zhao teaches at a feature extraction neural network ("The method comprises: inputting the first audio and the second audio to respectively detected to the neural network model, performing feature extraction based on the first audio and the second audio respectively," Zhao Abstract).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Li’s method of audio identification, which compares two audio signals, by incorporating Zhao’s method of utilizing a neural network to extract audio features. Using a neural network to extract audio features provides higher accuracy, automated learning, and better resilience to noise than manual or other mainstream methods.
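Claim 20, as quoted above, computes the ratio as a mean of the slopes of the peak lines. A minimal sketch of that arithmetic follows, assuming each peak line is given as an ordered list of (row, column) peak points and that its slope is taken between its first and last point. The names `line_slope` and `mean_slope_ratio` are hypothetical, and the endpoint-based slope is an assumption made for illustration.

```python
from statistics import mean


def line_slope(points):
    """Slope (columns per row) between the first and last point of one peak line."""
    (r0, c0), (r1, c1) = points[0], points[-1]
    if r1 == r0:
        raise ValueError("peak line must span at least two distinct rows")
    return (c1 - c0) / (r1 - r0)


def mean_slope_ratio(peak_lines):
    """Average the per-line slopes into a single ratio."""
    return mean(line_slope(line) for line in peak_lines)


# Two toy peak lines with slopes 1.5 and 2.5.
ratio = mean_slope_ratio([[(0, 0), (2, 3)], [(0, 0), (2, 5)]])
```

With the two toy lines shown, the slopes are 1.5 and 2.5, so the mean is 2.0.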

Allowable Subject Matter
Claims 4-10 and 13-19 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
The following is a statement of reasons for the indication of allowable subject matter: 

Regarding claims 4-7, Li, in view of Wang, discloses all of the limitations of claim 1. Li does not disclose wherein the one or more processing devices are configured to: identify the one or more peak lines at least in part by: selecting a list of candidate peak sets that each include a predefined number of the peak points; and over a plurality of filtering stages, computing a filtered list of the candidate peak sets; and compute the speed change ratio as a mean slope value of the candidate peak sets included in the filtered list.

Regarding claims 8-10, Li, in view of Wang, discloses all of the limitations of claim 1. Li does not disclose wherein the one or more processing devices are configured to identify the one or more peak lines at least in part by, for each peak point included in a subset of the plurality of peak points: computing respective candidate slopes between the peak point and a plurality of candidate endpoints included among the plurality of peak points; and for each of the candidate endpoints: determining whether the candidate slope is within a predefined slope range; and adding the candidate slope and the candidate endpoint to a candidate line map if the candidate slope is within the predefined slope range.
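The slope-range test recited for claims 8-10 can be pictured with the following sketch. It assumes that peak points are (row, column) tuples, that every peak point is used as a starting point, and that candidate endpoints are the peak points lying in a later row. The function name `build_candidate_line_map` and the dictionary layout are hypothetical, since the claim language quoted above does not specify a data structure.

```python
from collections import defaultdict


def build_candidate_line_map(peaks, slope_min, slope_max):
    """Map each starting peak to the (slope, endpoint) pairs whose slope is in range."""
    line_map = defaultdict(list)
    for start in peaks:
        for end in peaks:
            d_row = end[0] - start[0]
            if d_row <= 0:
                # Only consider endpoints in a later row; this also avoids division by zero.
                continue
            slope = (end[1] - start[1]) / d_row
            if slope_min <= slope <= slope_max:
                line_map[start].append((slope, end))
    return dict(line_map)


toy_peaks = [(0, 0), (1, 2), (2, 4), (2, 9)]
toy_map = build_candidate_line_map(toy_peaks, 1.5, 2.5)
```

In the toy data, the point (2, 9) is never added because its slopes from the earlier points (4.5 and 7.0) fall outside the assumed range of 1.5 to 2.5.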

Regarding claims 13-16, Li, in view of Wang, discloses all of the limitations of claim 11. Li does not disclose wherein: identifying the one or more peak lines includes: selecting a list of candidate peak sets that each include a predefined number of the peak points; and over a plurality of filtering stages, computing a filtered list of the candidate peak sets; and computing the speed change ratio as a mean slope value of the candidate peak sets included in the filtered list.

Regarding claims 17-19, Li, in view of Wang, discloses all of the limitations of claim 11. Li does not disclose wherein identifying the one or more peak lines includes, for each peak point included in a subset of the plurality of peak points: computing respective candidate slopes between the peak point and a plurality of candidate endpoints included among the plurality of peak points; and for each of the candidate endpoints: determining whether the candidate slope is within a predefined slope range; and adding the candidate slope and the candidate endpoint to a candidate line map if the candidate slope is within the predefined slope range.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure:
CN1446350A
US Patent Application Publication No. 2014/0214190
US Patent Application Publication No. 2024/0176815

Any inquiry concerning this communication or earlier communications from the examiner should be directed to ADAM MICHAEL WEAVER whose telephone number is (571) 272-7062. The examiner can normally be reached Monday-Friday, 8AM-5PM EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Richemond Dorvil can be reached at (571) 272-7602. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.




/ADAM MICHAEL WEAVER/ Examiner, Art Unit 2658

/RICHEMOND DORVIL/ Supervisory Patent Examiner, Art Unit 2658

Prosecution Timeline

Jul 29, 2024: Application Filed

Feb 18, 2026: Non-Final Rejection, §101, §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

18/161,033

Patent 12591752

ZERO-SHOT DOMAIN TRANSFER WITH A TEXT-TO-TEXT MODEL

2y 5m to grant Granted Mar 31, 2026

18/371,878

Patent 12585765

SYSTEM AND METHOD FOR ROBUST NATURAL LANGUAGE CLASSIFICATION UNDER CHARACTER ENCODING

2y 5m to grant Granted Mar 24, 2026

18/378,249

Patent 12579375

IMPLEMENTING ACTIVE LEARNING IN NATURAL LANGUAGE GENERATION TASKS

2y 5m to grant Granted Mar 17, 2026

18/147,118

Patent 12562077

METHOD, COMPUTING DEVICE, AND NON-TRANSITORY COMPUTER-READABLE RECORDING MEDIUM TO TRANSLATE AUDIO OF VIDEO INTO SIGN LANGUAGE THROUGH AVATAR

2y 5m to grant Granted Feb 24, 2026

Study what changed to get past this examiner. Based on 4 most recent grants.

Prosecution Projections

Expected OA Rounds: 1-2

Grant Probability: 92%

With Interview (+20.0%): 99%

Median Time to Grant: 2y 9m

PTA Risk: Low

Based on 12 resolved cases by this examiner. Grant probability derived from career allow rate.