Last updated: April 19, 2026

Application No. 19/085,949

AUDIO STEM IDENTIFICATION SYSTEMS AND METHODS

Non-Final OA §102 §DP

Filed

Mar 20, 2025

Examiner

LE, HUNG D

Art Unit

2161

Tech Center

2100 — Computer Architecture & Software

Assignee

Spotify AB

OA Round

1 (Non-Final)

Interview Optional

+6.4% interview lift. This examiner has a relatively high allow rate; a written response may suffice.

Based on 1073 resolved cases, 2023–2026

Examiner Intelligence

LE, HUNG D

Grants 90% — above average

Career Allow Rate

969 granted / 1073 resolved

+35.3% vs TC avg

Interview Lift: +6.4% (moderate), comparing resolved cases with and without an interview

Typical timeline

2y 6m

Avg Prosecution

33 currently pending

Career history

1106

Total Applications

across all art units

Statute-Specific Performance

§101: 12.3% (-27.7% vs TC avg)

§103: 39.2% (-0.8% vs TC avg)

§102: 20.6% (-19.4% vs TC avg)

§112: 9.2% (-30.8% vs TC avg)

Tech Center averages are estimates. Based on career data from 1073 resolved cases

Office Action

§102 §DP

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
DETAILED ACTION
1.	This Office Action is in response to the application filed on 03/20/2025.
Claims 1-20 are pending.

Priority
2.	This application is a continuation of 18/090,228 (Patent US 12,283,287), which was filed on 12/28/2022; the priority claim has been acknowledged and considered.

Double Patenting
3.	The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the "right to exclude" granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory obviousness-type double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); and In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on a nonstatutory double patenting ground provided the conflicting application or patent either is shown to be commonly owned with this application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. 
Effective January 1, 1994, a registered attorney or agent of record may sign a terminal disclaimer. A terminal disclaimer signed by the assignee must fully comply with 37 CFR 3.73(b). 
4. 	Claims 1-20 are rejected on the ground of nonstatutory obviousness-type double patenting as being unpatentable over claims 1-18 of U.S. Patent No. 12,283,287. Although the conflicting claims are not identical, they are not patentably distinct from each other.


Instant Application 19/085,949, Claim 1:
A computer system, comprising:
at least one processor; and
at least one memory storing instructions which when executed by the at least one processor cause the at least one processor to:
receive a query corresponding to a query audio content item;
determine a query vector corresponding to the query audio content item;
compare, in a vector space, the query vector and a plurality of target vectors corresponding to a plurality of target audio content items, to determine likelihood values indicating, for each respective target audio content item of the plurality of target audio content items, a probability that the respective target audio content item is a match for the query audio content item; and
cause output, via a graphical user interface, of information identifying one of the plurality of target audio content items having a highest of the likelihood values.

Patent US 12,283,287, Claim 1:
An audio content item identifier, comprising:
at least one processor; and
at least one memory storing instructions which when executed by the at least one processor cause the at least one processor to:
receive, by a client device, a query corresponding to a query audio content item;
determine a query vector corresponding to the query audio content item;
compare, in a vector space, the query vector and a plurality of target vectors corresponding to a plurality of target audio content items, to determine likelihood values indicating, for each respective target audio content item of the plurality of target audio content items, a probability that the respective target audio content item is a match for the query audio content item, wherein the query audio content item and the target audio content items are audio stems that are configured to be inserted into an audio content item during an audio content item creation process; and
output, to an editor tool, information identifying one of the plurality of target audio content items having a highest of the likelihood values.

Instant Application 19/085,949, Claim 9:
A method, comprising:
receiving a query corresponding to a query audio content item;
determining a query vector corresponding to the query audio content item;
comparing, in a vector space, the query vector and a plurality of target vectors corresponding to a plurality of target audio content items, to determine likelihood values indicating, for each respective target audio content item of the plurality of target audio content items, a probability that the respective target audio content item is a match for the query audio content item; and
causing output, via a graphical user interface, of information identifying one of the plurality of target audio content items having a highest of the likelihood values.

Patent US 12,283,287, Claim 10:
A method of identifying an audio content item, comprising:
receiving, by a client device, a query corresponding to a query audio content item;
determining a query vector corresponding to the query audio content item;
comparing, in a vector space, the query vector and a plurality of target vectors corresponding to a plurality of target audio content items, to determine likelihood values indicating, for each respective target audio content item of the plurality of target audio content items, a probability that the respective target audio content item is a match for the query audio content item, wherein the query audio content item and the target audio content items are audio stems that are configured to be inserted into an audio content item during an audio content item creation process; and
outputting, to an editor tool, information identifying one of the plurality of target audio content items having a highest of the likelihood values.

Instant Application 19/085,949, Claim 17:
A non-transitory computer-readable medium having stored thereon one or more sequences of instructions for causing one or more processors to perform:
receiving a query corresponding to a query audio content item;
determining a query vector corresponding to the query audio content item;
comparing, in a vector space, the query vector and a plurality of target vectors corresponding to a plurality of target audio content items, to determine likelihood values indicating, for each respective target audio content item of the plurality of target audio content items, a probability that the respective target audio content item is a match for the query audio content item; and
causing output, via a graphical user interface, of information identifying one of the plurality of target audio content items having a highest of the likelihood values.

Patent US 12,283,287, Claim 18:
A non-transitory computer-readable medium having stored thereon one or more sequences of instructions for causing one or more processors to perform:
receiving, by a client device, a query corresponding to a query audio content item;
determining a query vector corresponding to the query audio content item;
comparing, in a vector space, the query vector and a plurality of target vectors corresponding to a plurality of target audio content items, to determine likelihood values indicating, for each respective target audio content item of the plurality of target audio content items, a probability that the respective target audio content item is a match for the query audio content item, wherein the query audio content item and the target audio content items are audio stems that are configured to be inserted into an audio content item during an audio content item creation process; and
outputting, to an editor tool, information identifying one of the plurality of target audio content items having a highest of the likelihood values.
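The comparison step common to both sets of claims can be illustrated with a minimal sketch. The claims do not specify how a vector comparison becomes a likelihood value, so the softmax over negative Euclidean distances used here is an assumption, as are the function names `likelihoods` and `best_match` and the sample vectors.

```python
import math

def likelihoods(query, targets):
    """Map each target id to a likelihood value; values sum to 1 across targets.

    Assumption: a softmax over negative Euclidean distances stands in for the
    claimed "probability that the target is a match". The claims fix no formula.
    """
    scores = {tid: -math.dist(query, vec) for tid, vec in targets.items()}
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {tid: math.exp(s - top) for tid, s in scores.items()}
    total = sum(exps.values())
    return {tid: e / total for tid, e in exps.items()}

def best_match(query, targets):
    """Return the target id having the highest likelihood value."""
    probs = likelihoods(query, targets)
    return max(probs, key=probs.get)

# Hypothetical 3-dimensional vectors for a query item and three target items.
query_vector = [0.2, 0.8, 0.1]
target_vectors = {
    "target_a": [0.9, 0.1, 0.4],
    "target_b": [0.25, 0.75, 0.1],
    "target_c": [0.0, 0.0, 1.0],
}
print(best_match(query_vector, target_vectors))
```

Under this assumed scoring, the target nearest the query always receives the highest likelihood value, which is the equivalence the examiner relies on when mapping a shortest-distance search onto the claim language.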


Sull et al, US 20020069218, suggests outputting, to an editor tool, information …. [Sull: Paragraphs 31 and 76 (“The present invention also provides a virtual video editor in one embodiment. The virtual video editor includes a network controller constructed and arranged to access remote metafiles and remote video files and a file controller in operative connection to the network controller and constructed and arranged to access local metafiles and local video files, and to access the remote metafiles and the remote video files via the network controller”)].


Examiner’s Note
4.	Vector space (According to Google): “A vector space is a mathematical structure consisting of a set of elements (vectors) that can be added together and multiplied by scalars (numbers, usually real or complex). It must satisfy ten axioms, including closure under addition and scalar multiplication, commutativity, associativity, and the existence of zero and inverse vectors.”

Ellis et al, US 20130226957, [Ellis: Abstract and paragraph 6 (“identifying, using at least one hardware processor, a query song vector for the query song, wherein the query song vector is indicative of a two-dimensional Fourier transform based on the query song; identifying a plurality of reference song vectors that each correspond to one of a plurality of reference songs”)] [Ellis: Paragraphs 6 and 19 (“determining a distance between the query song vector and each of the plurality of reference song vectors; generating an indication that a reference song corresponding to a reference song vector with a shortest distance to the query song vector is a similar song to the query song”, i.e., ‘comparing, in a vector space, the query vector and a plurality of target vectors … a probability that the respective target audio content item is a match …’)] [Ellis: Paragraph 76 (“the distance between the query song vector and each of the reference song vectors can be found, and a predetermined number of reference songs with the smallest distance (e.g., one song, fifty songs, all reference songs, etc.) can be kept as similar songs”, i.e., ‘having a highest of the likelihood values’)] [Ellis: Paragraphs 51 and 52 (“distance between two vectors in the multi-dimensional space”, i.e., ‘in a vector space’)].
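The ranking described in the quoted Ellis passages (compute a distance from the query song vector to each reference song vector, then keep a predetermined number of the closest) might be sketched as follows. The quoted text does not name the metric, so plain Euclidean distance is an assumption, and `k_nearest` and the sample vectors are hypothetical.

```python
import math

def k_nearest(query, references, k=1):
    """Return the k reference ids with the smallest distance to the query.

    Assumption: Euclidean distance; the quoted passages only say "distance".
    """
    ranked = sorted(references, key=lambda rid: math.dist(query, references[rid]))
    return ranked[:k]

# Hypothetical 2-dimensional song vectors.
query_song = [0.0, 1.0]
reference_songs = {
    "song_1": [1.0, 0.0],
    "song_2": [0.1, 0.9],
    "song_3": [0.5, 0.5],
}
print(k_nearest(query_song, reference_songs, k=2))
```

Setting `k=1` corresponds to the "one song" case in paragraph 76, and `k=len(reference_songs)` to the "all reference songs" case.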


Schnitzer, US 20110004642, [Schnitzer: Title and Abstract (“identifying similar audio tracks”)] [Schnitzer: Paragraph 22 (“as may all tracks having a vector within a distance being a percentage of e.g. a maximum distance to all vectors or within a median of the distribution of distances”, i.e., ‘having a highest of the likelihood values’)] [Schnitzer: Paragraph 127 (“To use FastMap to quickly process music recommendation queries, we initially use it to map the Gaussian timbre models to k-dimensional vectors. In a two step filter-and-refine process we then use those vectors as a prefilter: given a query object we first filter the whole collection in the vector space (with the squared Euclidean distance) to return a number (filter-size) of possible nearest neighbours.”, i.e., ‘in a vector space’)].
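The two-step filter-and-refine process quoted from Schnitzer can be sketched roughly as below. This is not Schnitzer's implementation: the FastMap mapping and the Gaussian timbre models are not reproduced, the vectors are assumed to be given, and `exact_distance` is a hypothetical stand-in for whatever more expensive similarity measure the refine step uses.

```python
def squared_euclidean(a, b):
    """Squared Euclidean distance, the prefilter metric named in the quote."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def filter_and_refine(query_id, vectors, exact_distance, filter_size, k):
    """Prefilter in the vector space, then re-rank survivors with exact_distance."""
    q = vectors[query_id]
    # Filter step: keep the filter_size nearest candidates in the vector space.
    candidates = sorted(
        (tid for tid in vectors if tid != query_id),
        key=lambda tid: squared_euclidean(q, vectors[tid]),
    )[:filter_size]
    # Refine step: re-rank only those candidates with the costly measure.
    refined = sorted(candidates, key=lambda tid: exact_distance(query_id, tid))
    return refined[:k]

# Hypothetical k-dimensional vectors (k = 2) and a made-up exact-distance table.
vectors = {"q": [0, 0], "a": [1, 0], "b": [0, 2], "c": [5, 5], "d": [1, 1]}
exact_table = {"a": 0.9, "b": 0.3, "c": 0.1, "d": 0.5}
result = filter_and_refine("q", vectors, lambda _q, t: exact_table[t],
                           filter_size=3, k=2)
print(result)
```

In this toy data, item `c` has the smallest exact distance but is dropped by the prefilter, which shows the trade-off of the approach: a smaller `filter_size` is faster but can miss true nearest neighbours.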

Claim Rejections - 35 USC § 102
5.	In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status. 
6.	The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.


7.	Claims 1-20 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Ellis et al (US 20130226957).
Claim 1:
	Ellis suggests a computer system, comprising: at least one processor; and at least one memory storing instructions which when executed by the at least one processor cause the at least one processor to: receive a query corresponding to a query audio content item [Ellis: Abstract and paragraph 6 (“identifying, using at least one hardware processor, a query song vector for the query song,”)]. Ellis suggests determining a query vector corresponding to the query audio content item [Ellis: Abstract and paragraph 6 (“identifying, using at least one hardware processor, a query song vector for the query song, wherein the query song vector is indicative of a two-dimensional Fourier transform based on the query song; identifying a plurality of reference song vectors that each correspond to one of a plurality of reference songs”)]. Ellis suggests comparing, in a vector space, [Ellis: Paragraphs 51 and 52 (“distance between two vectors in the multi-dimensional space”, i.e., ‘in a vector space’)] the query vector and a plurality of target vectors corresponding to a plurality of target audio content items, to determine likelihood values indicating, for each respective target audio content item of the plurality of target audio content items, a probability that the respective target audio content item is a match for the query audio content item [Ellis: Paragraph 76 (“the distance between the query song vector and each of the reference song vectors can be found, and a predetermined number of reference songs with the smallest distance (e.g., one song, fifty songs, all reference songs, etc.) can be kept as similar songs”, i.e., ‘having a highest of the likelihood values’)].
Ellis suggests causing output, via a graphical user interface, of information identifying one of the plurality of target audio content items having a highest of the likelihood values [Ellis: Paragraph 76 (“the distance between the query song vector and each of the reference song vectors can be found, and a predetermined number of reference songs with the smallest distance (e.g., one song, fifty songs, all reference songs, etc.) can be kept as similar songs”, i.e., ‘having a highest of the likelihood values’)] [Ellis: Paragraphs 6 and 19 (“determining a distance between the query song vector and each of the plurality of reference song vectors; generating an indication that a reference song corresponding to a reference song vector with a shortest distance to the query song vector is a similar song to the query song”, i.e., ‘comparing, in a vector space, the query vector and a plurality of target vectors … a probability that the respective target audio content item is a match …’)].
Claim 2:
	Ellis suggests wherein the instructions, when executed by the at least one processor, further cause the at least one processor to: generate another audio content item based on the query audio content item and the information [Ellis: Paragraph 170 (“In other words, the search results may be personalized for the querying user based on, for example, social-graph information, user information, search or browsing history of the user, or other suitable information related to the user.”)]; and cause playback of the another audio content item [Ellis: Paragraph 41 (“a song being played can be identified (e.g., the mechanisms described herein can allow a user to identify the name of a song on the radio, or the name of a song being played live by the original performer or another performer, such as a cover band)”)].
Claim 3:
	Ellis suggests wherein the query includes a first audio type for the query audio content item; wherein the plurality of target audio content items have at least one second audio type; and wherein the first audio type is different from the at least one second audio type [Ellis: Paragraph 142 (“A song can be received in any suitable format. For example, in some embodiments, the song can be received as: analog audio data; a bit stream of digital audio data; a file formatted in an uncompressed file format such as Waveform Audio File Format (WAV), Audio Interchange File Format (AIFF), or the like; a file formatted using a compression format featuring lossless compression such as MPEG-4 SLS format, Free Lossless Audio Codec (FLAC) format, or the like; a file formatted using a compression format featuring lossy compression such as MP3, Advanced Audio Coding (AAC), or the like; or any other suitable format”)].
Claim 4:
	Ellis suggests wherein the query includes the at least one second audio type for the plurality of target audio content items [Ellis: Paragraph 142 (“A song can be received in any suitable format. For example, in some embodiments, the song can be received as: analog audio data; a bit stream of digital audio data; a file formatted in an uncompressed file format such as Waveform Audio File Format (WAV), Audio Interchange File Format (AIFF), or the like; a file formatted using a compression format featuring lossless compression such as MPEG-4 SLS format, Free Lossless Audio Codec (FLAC) format, or the like; a file formatted using a compression format featuring lossy compression such as MP3, Advanced Audio Coding (AAC), or the like; or any other suitable format”)].
Claim 5:
	Ellis suggests wherein the first audio type is one of vocals or instrumentals; and wherein the at least one second audio type is the other of vocals or instrumentals [Ellis: Paragraph 133 (“a brief example of singing plus guitar.”)].
Claim 6:
	Ellis suggests wherein the query vector and the plurality of target vectors are acoustic feature vectors [Ellis: Paragraph 133 (“a brief example of singing plus guitar.”)] [Ellis: Paragraph 148 (“acoustic evidence of any beats”)].
Claim 7:
	Ellis suggests wherein acoustic features of the acoustic feature vectors include one or more of: vibration, distortion, presence of a vocoder, energy, valance, signal amplitude, or time-frequency progression [Ellis: Paragraphs 131 and 133 (“Peaks in the onset envelope 1104 correspond to times when there are significant energy onsets across multiple bands in the signal.”)].
Claim 8:
	Ellis suggests wherein the information identifies two or more of the plurality of target audio content items having the highest of the likelihood values [Ellis: Paragraph 76 (“the distance between the query song vector and each of the reference song vectors can be found, and a predetermined number of reference songs with the smallest distance (e.g., one song, fifty songs, all reference songs, etc.) can be kept as similar songs”, i.e., ‘having a highest of the likelihood values’)].
Claim 9:
Claim 9 is essentially the same as claim 1 except that it sets forth the claimed invention as a method rather than a system, and it is rejected for the same reasons as applied above.
Claim 10:
Claim 10 is essentially the same as claim 2 except that it sets forth the claimed invention as a method rather than a system, and it is rejected for the same reasons as applied above.
Claim 11:
Claim 11 is essentially the same as claim 3 except that it sets forth the claimed invention as a method rather than a system, and it is rejected for the same reasons as applied above.
Claim 12:
Claim 12 is essentially the same as claim 4 except that it sets forth the claimed invention as a method rather than a system, and it is rejected for the same reasons as applied above.
Claim 13:
Claim 13 is essentially the same as claim 5 except that it sets forth the claimed invention as a method rather than a system, and it is rejected for the same reasons as applied above.
Claim 14:
Claim 14 is essentially the same as claim 6 except that it sets forth the claimed invention as a method rather than a system, and it is rejected for the same reasons as applied above.
Claim 15:
Claim 15 is essentially the same as claim 7 except that it sets forth the claimed invention as a method rather than a system, and it is rejected for the same reasons as applied above.
Claim 16:
Claim 16 is essentially the same as claim 8 except that it sets forth the claimed invention as a method rather than a system, and it is rejected for the same reasons as applied above.
Claim 17:
Claim 17 is essentially the same as claim 1 except that it sets forth the claimed invention as a program product rather than a system, and it is rejected for the same reasons as applied above.
Claim 18:
Claim 18 is essentially the same as claim 2 except that it sets forth the claimed invention as a program product rather than a system, and it is rejected for the same reasons as applied above.
Claim 19:
Claim 19 is essentially the same as claim 3 except that it sets forth the claimed invention as a program product rather than a system, and it is rejected for the same reasons as applied above.
Claim 20:
Claim 20 is essentially the same as claim 4 except that it sets forth the claimed invention as a program product rather than a system, and it is rejected for the same reasons as applied above.


8.	Any inquiry concerning this communication or earlier communications from the examiner should be directed to Hung D. Le, whose telephone number is 571-270-1404. The examiner can normally be reached Monday to Friday, 9:00 A.M. to 5:00 P.M.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Apu Mofiz, can be reached at 571-272-4080. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, contact 800-786-9199 (in USA or Canada) or 571-272-1000.





Hung Le
02/02/2026

/HUNG D LE/Primary Examiner, Art Unit 2161


Prosecution Timeline

Mar 20, 2025

Application Filed

Feb 02, 2026

Non-Final Rejection — §102, §DP (current)

Precedent Cases

Applications granted by this same examiner with similar technology

18/464,188

Patent 12596684

SYSTEMS AND METHODS FOR SEARCHING DEDUPLICATED DATA

2y 5m to grant Granted Apr 07, 2026

18/738,469

Patent 12596724

SYSTEMS AND METHODS FOR USE IN REPLICATING DATA

2y 5m to grant Granted Apr 07, 2026

18/962,656

Patent 12596736

SYSTEMS AND METHODS FOR USING PROMPT DISSECTION FOR LARGE LANGUAGE MODELS

2y 5m to grant Granted Apr 07, 2026

18/060,184

Patent 12591489

POINT-IN-TIME DATA COPY IN A DISTRIBUTED SYSTEM

2y 5m to grant Granted Mar 31, 2026

18/742,058

Patent 12585625

SYSTEM AND METHOD FOR IMPLEMENTING A DATA QUALITY FRAMEWORK AND ENGINE

2y 5m to grant Granted Mar 24, 2026

Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

1-2

Expected OA Rounds

90%

Grant Probability

97%

With Interview (+6.4%)

2y 6m

Median Time to Grant

Low

PTA Risk

Based on 1073 resolved cases by this examiner. Grant probability derived from career allow rate.
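The projection figures can be reproduced from the counts reported above. This is a minimal arithmetic check, assuming the interview lift is additive in percentage points (the page does not state how the lift is applied); the variable names are illustrative.

```python
granted = 969    # grants reported for this examiner
resolved = 1073  # resolved cases reported for this examiner
interview_lift_pts = 6.4  # assumption: lift is additive, in percentage points

allow_rate_pct = 100 * granted / resolved
with_interview_pct = allow_rate_pct + interview_lift_pts

print(f"{allow_rate_pct:.1f}%")      # 90.3%, shown on the page as 90%
print(f"{with_interview_pct:.1f}%")  # 96.7%, shown on the page as 97%
```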
