DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Remarks
This action is in response to the amendments received on January 21, 2026. Claims 1-20 are pending in the application. Applicant’s arguments have been carefully and respectfully considered.
Claims 1-20 are rejected under 35 U.S.C. 101.
Claims 1-3, 5, 8-11, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Lee et al. (US 2021/0294840), and further in view of Venti et al. (US 12,347,409).
Claim 4 is rejected under 35 U.S.C. 103 as being unpatentable over Lee, and further in view of Wold (US 11,816,151).
Claims 6 and 7 are rejected under 35 U.S.C. 103 as being unpatentable over Lee, and further in view of Meng et al. (US 10,885,091).
Claims 12, 14, 16, 17, and 19 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Bertin-Mahieux et al. (US 2013/0091167).
Claim 13 is rejected under 35 U.S.C. 103 as being unpatentable over Bertin-Mahieux, and further in view of Lee et al. (US 2021/0294840).
Claims 15 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Bertin-Mahieux, and further in view of Meng et al. (US 10,885,091).
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 2A, Prong One asks: Is the claim directed to a law of nature, a natural phenomenon (product of nature) or an abstract idea? See MPEP 2106.04 Part I. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. See MPEP 2106.04(a).
With respect to claims 1 and 20, the limitation of “determining a characterization of acoustic content and emotive content of audio of an input media”, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind. That is, nothing in the claim element precludes the step from practically being performed in the mind. For example, “determining” in the context of this claim encompasses the user mentally characterizing media. Similarly, the limitations of “identifying one or more matched media that are a match to the input media” and “searching a set of media”, as drafted, are processes that, under their broadest reasonable interpretation, cover performance of the limitations in the mind. For example, “identifying” and “searching” in the context of this claim encompass the user mentally deciding which media match. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
At Step 2A, Prong Two, this judicial exception is not integrated into a practical application. Claim 20 recites a processor to execute the operations; however, the processor is recited at a high level of generality (i.e., as a generic processor performing a generic computer function) such that it amounts to no more than mere instructions to apply the exception using a generic computer component. Additionally, the claim recites “outputting an identification.” These elements do not integrate the abstract idea into a practical application because they do not impose a meaningful limit on the judicial exception and provide only insignificant extra-solution activity in conjunction with the abstract idea.
The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional elements amount to no more than mere instructions to apply an exception using generic computer components. Mere instructions to apply an exception using generic computer components cannot provide an inventive concept.
With respect to “outputting an identification”, the courts have found limitations directed towards outputting data to be well-understood, routine, and conventional. See MPEP 2106.05(d)(II) (citing presenting offers and gathering statistics, OIP Techs., 788 F.3d at 1362-63, 115 USPQ2d at 1092-93).
Considering the additional elements individually and in combination and the claim as a whole, the additional elements do not provide significantly more than the abstract idea. The claim is not patent eligible.
With respect to claims 2-5 and 8-11, the limitations further define the above-identified abstract ideas and do not provide additional elements that amount to significantly more than the abstract idea.
With respect to claims 6 and 7, the limitations are directed towards receiving identification by query, translating the query, and obtaining the audio. The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional elements amount to no more than mere instructions to apply an exception using generic computer components. Mere instructions to apply an exception using generic computer components cannot provide an inventive concept.
With respect to “translating, by a machine learning model, the natural language query”, under its broadest reasonable interpretation, this covers performance of the limitation in the mind. That is, nothing in the claim element precludes the step from practically being performed in the mind. The addition of “a machine learning model” adds only mere instructions to apply the exception using a generic computer component.
With respect to “receiving” and “obtaining”, the courts have found limitations directed towards data gathering to be well-understood, routine, and conventional. See MPEP 2106.05(d)(II) (citing receiving or transmitting data over a network, e.g., using the Internet to gather data, Symantec, 838 F.3d at 1321, 120 USPQ2d at 1362 (utilizing an intermediary computer to forward information)).
With respect to claim 12, the limitation of “determining a characterization of acoustic content and emotive content of audio of an input media”, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind. That is, nothing in the claim element precludes the step from practically being performed in the mind. For example, “determining” in the context of this claim encompasses the user mentally characterizing media. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
At Step 2A, Prong Two, this judicial exception is not integrated into a practical application. Claim 12 recites at least one processor to execute the instructions; however, the processor is recited at a high level of generality (i.e., as a generic processor performing a generic computer function) such that it amounts to no more than mere instructions to apply the exception using a generic computer component. Additionally, the claim recites “obtaining an input media” and “storing an indicator.” These elements do not integrate the abstract idea into a practical application because they do not impose a meaningful limit on the judicial exception and provide only insignificant extra-solution activity that is mere data gathering and storage in conjunction with the abstract idea.
The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional elements amount to no more than mere instructions to apply an exception using generic computer components. Mere instructions to apply an exception using generic computer components cannot provide an inventive concept.
With respect to “obtaining an input media”, the courts have found limitations directed towards data gathering to be well-understood, routine, and conventional. See MPEP 2106.05(d)(II) (citing receiving or transmitting data over a network, e.g., using the Internet to gather data, Symantec, 838 F.3d at 1321, 120 USPQ2d at 1362 (utilizing an intermediary computer to forward information)).
With respect to “storing an indicator”, the courts have found limitations directed towards storing data to be well-understood, routine, and conventional. See MPEP 2106.05(d)(II) (citing electronic recordkeeping, Alice Corp. Pty. Ltd. v. CLS Bank Int'l, 573 U.S. 208, 225, 110 USPQ2d 1984 (2014) (creating and maintaining "shadow accounts"), and storing and retrieving information in memory, Versata Dev. Group, Inc. v. SAP Am., Inc., 793 F.3d 1306, 1334, 115 USPQ2d 1681, 1701 (Fed. Cir. 2015)).
Considering the additional elements individually and in combination and the claim as a whole, the additional elements do not provide significantly more than the abstract idea. The claim is not patent eligible.
With respect to claims 13-19, the limitations further define the above-identified abstract ideas and do not provide additional elements that amount to significantly more than the abstract idea.
Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.
Claims 12, 14, 16, 17, and 19 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Bertin-Mahieux et al. (US 2013/0091167).
With respect to claim 12, Bertin-Mahieux teaches at least one computer-readable storage medium having encoded thereon executable instructions that, when executed by at least one processor, cause the at least one processor to carry out a method, the method comprising:
obtaining an input media, the input media comprising audio (Bertin-Mahieux, pa 0028, receive a song or a portion of a song);
determining a characterization of acoustic content and emotive content of audio of the input media based on a plurality of musical attributes extracted from the audio of the input media (Bertin-Mahieux, pa 0029, In some embodiments, chroma vectors can be extracted from the song in accordance with musical segments of the song or based on the time periods. A chroma vector can be characterized as having a bin that corresponds to each of twelve semitones (e.g., piano keys) within an octave formed by folding all octaves together (e.g., putting the intensity of semitone A across all octaves in the same semitone bin 1, putting the intensity of semitone B across all octaves in the same semitone bin 2, putting the intensity of semitone C across all octaves in the same semitone bin 3, etc.). Examiner note: acoustic content is represented by the semitones and emotive content is represented by the intensity of the semitones throughout the song); and
storing an indicator of the input media in association with the characterization of acoustic content and emotive content of the audio (Bertin-Mahieux, pa 0033, landmarks can be found from an array of normalized (and/or averaged) beat-synchronized chroma vectors (e.g., a normalized, beat-synchronized chroma matrix). Landmarks can represent prominent pitch information from a chroma vector. For example, if a semitone corresponding to bin 1 is a prominent semitone in a particular chroma vector, this can be set as a landmark. & pa 0050, the landmarks can be stored in a database in association with the song that the landmarks are derived from.).
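For illustration only (this sketch forms no part of the grounds of rejection, and the function names and intensity values are hypothetical), the octave folding Bertin-Mahieux describes in pa 0029, and the selection of a prominent bin as a landmark per pa 0033, can be pictured as follows:

    import numpy as np

    def fold_to_chroma(note_intensities):
        """Fold per-note intensities (keyed by MIDI note number) into a
        12-bin chroma vector; semitones an octave apart share one bin."""
        chroma = np.zeros(12)
        for midi_note, intensity in note_intensities.items():
            chroma[midi_note % 12] += intensity  # octave folding
        return chroma

    def landmark_bin(chroma):
        """Return the most prominent semitone bin as a landmark."""
        return int(np.argmax(chroma))

    # Hypothetical intensities: A3 (MIDI 57) and A4 (MIDI 69) fold together.
    notes = {57: 0.8, 69: 0.6, 60: 0.3}
    print(landmark_bin(fold_to_chroma(notes)))  # -> 9, the "A" bin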
With respect to claim 14, Bertin-Mahieux teaches the at least one computer-readable storage medium of claim 12, wherein storing the indicator of the input media further comprises storing attribute-time pairs in association with the characterization of acoustic content and emotive content of the audio, each attribute-time pair indicating a time in the input media at which a musical attribute in the plurality of musical attributes changes (Bertin-Mahieux, pa 0066, a musical event can include each time there is a change in pitch in the song. In some embodiments, whenever there is a musical event, a chroma vector can be calculated. & pa 0071, the landmarks identified at 406 can be stored and the location of the stored landmarks within the song can be identified with a set of coordinates (e.g., (time, chroma)), where time is the time frame location of the landmark, and chroma is the chroma bin of the landmark.).
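A similar illustrative sketch (again hypothetical, assuming a beat-synchronized chroma matrix as input) of recording a (time, chroma) coordinate whenever the prominent pitch changes, per pa 0066 and pa 0071:

    import numpy as np

    def landmark_coordinates(beat_chroma):
        """Record a (time, chroma) coordinate pair whenever the prominent
        semitone bin changes from one beat-synchronized frame to the next."""
        coords, prev = [], None
        for t, frame in enumerate(beat_chroma):
            b = int(np.argmax(frame))
            if b != prev:  # a musical event: change in prominent pitch
                coords.append((t, b))
                prev = b
        return coords

    # Three frames whose prominent bin moves from 9 to 9 to 0.
    frames = np.full((3, 12), 0.1)
    frames[0, 9], frames[1, 9], frames[2, 0] = 1.0, 1.0, 1.0
    print(landmark_coordinates(frames))  # -> [(0, 9), (2, 0)]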
With respect to claim 16, Bertin-Mahieux teaches the at least one computer-readable storage medium of claim 12, wherein the plurality of music attributes includes one or more of: musical tempo; musical key (Bertin-Mahieux, pa 0029, A chroma vector can be characterized as having a bin that corresponds to each of twelve semitones (e.g., piano keys) within an octave formed by folding all octaves together (e.g., putting the intensity of semitone A across all octaves in the same semitone bin 1, putting the intensity of semitone B across all octaves in the same semitone bin 2, putting the intensity of semitone C across all octaves in the same semitone bin 3, etc.)); presence of vocals; musical complexity; positivity; genre; instruments used; place of composition; or stylistic era.
With respect to claim 17, Bertin-Mahieux teaches the at least one computer-readable storage medium of claim 12, wherein the characterization of the emotive content of the audio is determined based on one or more of musical prosody, lyrical content, melodic key, or harmonic structure (Bertin-Mahieux, pa 0029, A chroma vector can be characterized as having a bin that corresponds to each of twelve semitones (e.g., piano keys) within an octave formed by folding all octaves together (e.g., putting the intensity of semitone A across all octaves in the same semitone bin 1, putting the intensity of semitone B across all octaves in the same semitone bin 2, putting the intensity of semitone C across all octaves in the same semitone bin 3, etc.)).
With respect to claim 19, Bertin-Mahieux teaches the at least one computer-readable storage medium of claim 12, wherein determining the characterization of the acoustic content and emotive content of audio of an input media comprises determining a characterization of a segment of the input media (Bertin-Mahieux, pa 0030, Chroma vectors can be extracted from … a portion of a song).
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-3, 5, 8-11, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Lee et al. (US 2021/0294840), and further in view of Venti et al. (US 12,347,409).
With respect to claim 1, Lee teaches a method comprising:
determining a characterization of acoustic content … of audio of an input media based at least in part on a plurality of music attributes of the input media (Lee, pa 0145, the neural network module 304 can determine a portion of the query music file 312 based on the characteristics of the music content of the query music file 312, such as spectral characteristics, temporal characteristic, amplitude characteristics, dynamic range, etc., and then generate a feature vector from the determined portion of the query music file 312);
identifying one or more matched media that are a match to the input media, wherein identifying the one or more matched media comprises searching a set of media based on the characterization of acoustic content and emotive content of the input media to identify media having a matching acoustic content and a matching emotive content (Lee, pa 0152, The distance module 308 can be implemented to determine distances between feature vectors, such as distances between the feature vector for the query music file 312 and feature vectors for the music files to search over. & pa 0160, the music retrieval module 306 can determine the music files that are closest to the query music file 312 based on the distances with the smallest values, and return these music files as search results.); and
outputting an identification of the one or more matched media as potential matches to the input media with respect to acoustic content and emotive content (Lee, pa 0160, The music retrieval module 306 can return music files based on the ranked, ordered list, such as by returning a top number (e.g., top ten) of the music files in the ranked, ordered list, the top number corresponding to the music files with the smallest distances.).
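For illustration only (no part of the grounds of rejection), the distance-based retrieval Lee describes in pa 0152 and pa 0160 can be sketched as follows; the Euclidean metric, function names, and vector dimensions are assumptions, as the cited passages do not fix a particular distance measure:

    import numpy as np

    def retrieve_closest(query_vec, library_vecs, top_k=10):
        """Rank the library's feature vectors by distance to the query's
        feature vector and return the indices of the top_k closest files."""
        dists = np.linalg.norm(library_vecs - query_vec, axis=1)
        order = np.argsort(dists)[:top_k]
        return order, dists[order]

    library = np.random.rand(100, 128)  # 100 music files, 128-dim vectors
    query = np.random.rand(128)
    indices, distances = retrieve_closest(query, library, top_k=10)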
Lee does not expressly disclose determining a characterization of acoustic content and emotive content of audio of an input media based at least in part on a plurality of music attributes of the input media and identifying one or more matched media that are a match to the input media, wherein identifying the one or more matched media comprises searching a set of media based on the characterization of acoustic content and emotive content of the input media.
Venti teaches determining a characterization of acoustic content and emotive content of audio of an input media based at least in part on a plurality of music attributes of the input media (Venti, Col. 5 Li. 59-64, The segmented portions 32 are analyzed to identify it's musical quality, e.g., the character or feel of the segment 32, such that it can be categorized with one or more particular emotion, style, or vibe attributes, i.e., perceived or anticipated emotional reactions, moods, or perceived atmospheres that a human would have when hearing the segment & Col. 6 Li. 6-11, There may be many types of musical qualities which can be identified with a segment, such as, for example, low intensity or low energy, high intensity, or high energy, high or low stress, an intensity slope that increases, decreases, or is constant, emotion state, emotion vector, music complexity consistency, dynamic music consistency, etc.);
identifying one or more matched media that are a match to the input media, wherein identifying the one or more matched media comprises searching a set of media based on the characterization of acoustic content and emotive content of the input media to identify media having a matching acoustic content and a matching emotive content (Venti, Col. 7 Li. 5-11, the system iteratively searches for additional digital music files which have segments which match the emotion, style, or vibe attribute of at least one segmented portion of the first or second digital music files, such that the composite soundtrack can be constructed from segmented portions of the first, second, and additional digital music files.).
It would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains to have modified Lee with the teachings of Venti because doing so allows audio to be found according to a desired emotion, style, or vibe (Venti, Col. 7 Li. 22-26).
With respect to claim 2, Lee teaches the method of claim 1, wherein the plurality of music attributes includes one or more of: musical tempo; musical key; presence of vocals; musical complexity; positivity; genre; instruments used; place of composition; or stylistic era (Lee, pa 0141, the query music file 312 has musical attributes including a genre attribute of type "X" (e.g., type "X" may designate the genre of the query music file 312 as "classic rock"), an instrument attribute of type "Y" (e.g., type "Y" may designate that the query music file 312 includes a piano and vocals), a mood attribute of type "Z" (e.g., type "Z" may designate the mood of the query music file 312 as "happy"), and a tempo attribute of type "Z", pa 0142, The user may provide user input to the user interface module 302 that includes a selection of one or more of the musical attributes of genre, instruments, mood, and tempo.& pa 0145, the neural network module 304 can determine a portion of the query music file 312 based on the characteristics of the music content of the query music file 312, such as spectral characteristics, temporal characteristic, amplitude characteristics, dynamic range, etc.).
With respect to claim 3, Lee teaches the method of claim 1, wherein the characterization of the emotive content of the audio is determined based on one or more of musical prosody, lyrical content, melodic key, or harmonic structure (Lee, pa 0145, the neural network module 304 can determine a portion of the query music file 312 based on the characteristics of the music content of the query music file 312, such as spectral characteristics, temporal characteristic, amplitude characteristics, dynamic range, etc.).
With respect to claim 5, Lee teaches the method of claim 1, wherein: obtaining the input media comprises obtaining a plurality of input media; and determining the characterization of the input media comprises determining an aggregate characterization of the acoustic content and the emotive content of the plurality of input media, wherein determining the aggregate characterization comprises evaluating at least some of the plurality of music attributes for each media of the plurality of input media (Lee, pa 0146, the neural network module 304 generates a feature vector for the query music file 312 by averaging multiple feature vectors of the query music file 312. For example, the neural network module 304 can generate multiple feature vectors of the query music file 312 for different and non-overlapping sections of the query music file 312, such as different versions of a chorus or bridge throughout the query music file 312, randomly sampled sections of the query music file 312, or contiguous sections of the query music file 312. The neural network module 304 can then form an average of the multiple feature vectors to form the feature vector for the query music file).
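Purely as an illustration of the averaging Lee describes in pa 0146 (the function name and dimensions are hypothetical):

    import numpy as np

    def aggregate_feature_vector(section_vectors):
        """Average feature vectors generated from different, non-overlapping
        sections of a query music file into one aggregate feature vector."""
        return np.stack(section_vectors).mean(axis=0)

    sections = [np.random.rand(128) for _ in range(4)]  # e.g., chorus, bridge
    query_vec = aggregate_feature_vector(sections)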
With respect to claim 8, Lee teaches the method of claim 1, wherein determining the characterization of the acoustic content and the emotive content of audio of an input media comprises determining a characterization of a segment of the input media (Lee, pa 0145, the neural network module 304 can randomly determine a portion of the query music file 312 and generate a feature vector from the randomly determined portion. In one example, the neural network module 304 determines a portion of the query music file 312 using an automatic music segmentation system that splits a song into segments. The portion of the query music file 312 can be determined as one of the segments of the song produced automatically from the music segmentation system.).
With respect to claim 9, Lee teaches the method of claim 8, wherein determining the characterization of the segment further comprises refraining from determining characterizations of other segments of the input media (Lee, pa 0145, The neural network module 304 can generate a feature vector for the query music file 312 from any suitable portion of the query music file 312, such as a three second duration of the query music file 312. For example, the neural network module 304 can randomly determine a portion of the query music file 312 and generate a feature vector from the randomly determined portion).
With respect to claim 10, Lee teaches the method of claim 1, wherein outputting the identification of the one or more matched media comprises returning one or more segments of the one or more matched media (Lee, pa 0160, The music retrieval module 306 can return music files based on the ranked, ordered list, such as by returning a top number (e.g., top ten) of the music files in the ranked, ordered list, the top number corresponding to the music files with the smallest distances.).
With respect to claim 11, Lee teaches the method of claim 1, wherein determining the characterization of the acoustic content and the emotive content of audio of the input media comprises determining the characterization of a plurality of segments of one or more input media (Lee, pa 0146, the neural network module 304 generates a feature vector for the query music file 312 by averaging multiple feature vectors of the query music file 312. For example, the neural network module 304 can generate multiple feature vectors of the query music file 312 for different and non-overlapping sections of the query music file 312, such as different versions of a chorus or bridge throughout the query music file 312, randomly sampled sections of the query music file 312, or contiguous sections of the query music file 312. The neural network module 304 can then form an average of the multiple feature vectors to form the feature vector for the query music file).
With respect to claim 20, Lee teaches an apparatus comprising: at least one processor; and at least one storage medium. The claim recites limitations similar to those of claim 1 and is rejected as discussed above with respect to claim 1.
Claim 4 is rejected under 35 U.S.C. 103 as being unpatentable over Lee, and further in view of Wold (US 11,816,151).
With respect to claim 4, Lee teaches the method of claim 1, as discussed above. Lee does not expressly disclose wherein the input media comprises video and the audio.
Wold teaches wherein the input media comprises video and the audio (Wold, Col. 2 Li. 57-59, A media content item may be audio (e.g., a song or album), a video).
It would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains to have modified Lee with the teachings of Wold because video is a popular form of entertainment for users (Wold, Col. 1 Li. 13-16).
Claims 6 and 7 are rejected under 35 U.S.C. 103 as being unpatentable over Lee, and further in view of Meng et al. (US 10,885,091).
With respect to claim 6, Lee teaches the method of claim 1, as discussed above. Lee does not expressly disclose receiving an identification of the input media and obtaining the audio of the input media based on the identification of the input media.
Meng teaches receiving an identification of the input media (Meng, Col. 9 Li. 35-36, users may submit utterances that may include various commands, requests, and the like); and
obtaining the audio of the input media based on the identification of the input media (Meng, Col. 9 Li. 55-58, the result can be provided to the media service to refine and/or initiate playback of media content using the voice communications device.).
It would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains to have modified Lee with the teachings of Meng because doing so allows a backend server to analyze the identification and provide the audio without the user directly inputting audio (Meng, Col. 6 Li. 1-27).
With respect to claim 7, Lee in view of Meng teaches the method of claim 6, wherein receiving the identification of the input media comprises: receiving a natural language query for an item of media (Meng, Col. 6 Li. 55-60, In response to the user 102 speaking the phrase "Alexa, play music I like," audio input data 106 that includes the phrase is received at the voice communications device 104 and an application executing on the voice communications device or otherwise in communication with the voice communications device can analyze the audio input data 106.); and translating, by a machine learning model, the natural language query into the identification of the input media (Meng, Col. 6 Li. 1-23, the backend server can start analyzing whatever portion of the audio input data it received through a variety of techniques such as automatic speech recognition (ASR) and natural language understanding (NLU) to convert the audio input data into a series of identifiable words, and then to analyze those words … The backend server can then cause music associated with the initialization information to be played using the voice communications device).
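For illustration only, a toy stand-in for the query translation discussed above; real ASR/NLU pipelines are far more involved, and the keyword matching and catalog structure below are assumptions, not Meng's implementation:

    import re

    def translate_query(utterance, catalog):
        """Map a natural language query to a media identifier by keyword
        match against a catalog of {identifier: title} entries."""
        words = set(re.findall(r"\w+", utterance.lower()))
        for media_id, title in catalog.items():
            if words & set(title.lower().split()):
                return media_id
        return None

    def obtain_audio(media_id, media_store):
        """Fetch the audio for the identified media from a store."""
        return media_store.get(media_id)

    catalog = {"song-42": "Happy Classic Rock Anthem"}
    print(translate_query("play that happy anthem", catalog))  # -> "song-42"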
Claim 13 is rejected under 35 U.S.C. 103 as being unpatentable over Bertin-Mahieux, and further in view of Lee et al. (US 2021/0294840).
With respect to claim 13, Bertin-Mahieux teaches the at least one computer-readable storage medium of claim 12, as discussed above. Bertin-Mahieux does not expressly disclose wherein storing the indicator of the input media further comprises storing the indicator of the input media in association with at least one of: a spectrogram of the audio.
Lee teaches wherein storing the indicator of the input media further comprises storing the indicator of the input media in association with at least one of: a spectrogram of the audio (Lee, pa 0081, The search panel 108 also includes a representation of a query music file 112, which is an example of a music file or audio file that can be loaded into the music search system 104 by a user for searching for music files. For instance, the user may provide the query music file 112 to the music search system 104 to search for music files that are perceptually similar to the query music file 112. The user interface 106 can display any suitable representation of the query music file 112 in the search panel 108, such as a spectrogram); a chromogram of the audio; lyrics of the input media; or licensing terms of the input media.
It would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains to have modified Bertin-Mahieux with the teachings of Lee because a spectrogram provides a detailed representation of the content of a music file (Lee, pa 0068).
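For illustration only, a hypothetical sketch of storing an indicator of a song in association with a magnitude spectrogram of its audio; the window, hop, and FFT parameters are illustrative choices, not taken from Lee:

    import numpy as np

    def store_indicator(db, song_id, samples, frame=1024, hop=512):
        """Store a song's indicator in a dict 'database' in association
        with a magnitude spectrogram of its audio."""
        windows = [samples[i:i + frame] * np.hanning(frame)
                   for i in range(0, len(samples) - frame + 1, hop)]
        spectrogram = np.abs(np.fft.rfft(np.stack(windows), axis=1))
        db[song_id] = {"spectrogram": spectrogram}

    db = {}
    store_indicator(db, "song-42", np.random.rand(44100))  # one second of audio
    print(db["song-42"]["spectrogram"].shape)  # (num_frames, 513)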
Claims 15 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Bertin-Mahieux, and further in view of Meng et al. (US 10,885,091).
With respect to claim 15, Bertin-Mahieux teaches the at least one computer-readable storage medium of claim 12, as discussed above. Bertin-Mahieux does not expressly disclose wherein obtaining the input media comprises retrieving the input media from a published media data set.
Meng teaches wherein obtaining the input media comprises retrieving the input media from a published media data set (Meng, Col. 6 Li. 66 - Col. 7 Li. 13, The media service 210 can correspond to an online service that provides access to media content, such as music, e-books, audio broadcasts, etc. In one example, the media service 210 can be associated with an online electronic marketplace that provides media content. Moreover, in some embodiments, the media service 210 can comprise one or more media libraries or databases 212. … the one or more media libraries 212 can reside on one or more servers external to one or more servers on which the media service 210 resides. For example, the media libraries can be stored in media content data store 217 provided by content provider 215 & Col. 7 Li. 19-22, The voice communications device 204 can acquire (e.g., download, stream, etc.) the data from the media service 210 and/or content provider 215 and, as a result, play the media content).
It would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains to have modified Bertin-Mahieux with the teachings of Meng because doing so provides compatibility with online media services.
With respect to claim 18, Bertin-Mahieux teaches the at least one computer-readable storage medium of claim 12, as discussed above. Bertin-Mahieux does not expressly disclose receiving an identification of the input media and obtaining the audio of the input media based on the identification of the input media.
Meng teaches receiving an identification of the input media (Meng, Col. 9 Li. 35-36, users may submit utterances that may include various commands, requests, and the like); and
obtaining the audio of the input media based on the identification of the input media (Meng, Col. 9 Li. 55-58, the result can be provided to the media service to refine and/or initiate playback of media content using the voice communications device.).
It would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains to have modified Bertin-Mahieux with the teachings of Meng because doing so allows a backend server to analyze the identification and provide the audio without the user directly inputting audio (Meng, Col. 6 Li. 1-27).
Response to Arguments
35 U.S.C. 103
Applicant argues that Bertin-Mahieux does not teach “determining a characterization of acoustic content and emotive content of audio of the input media based on a plurality of musical attributes extracted from the audio of the input media” because Bertin-Mahieux does not disclose characterization of emotive content. The Examiner respectfully disagrees. The specification states, “Emotive content may include musical structures and/or lyrical structures tailored to evoke a particular emotion or emotions in a listener.” Bertin-Mahieux describes extracting chroma vectors from a song that reflect the pitch of the song segment (pa 0029). The chroma vectors can be used to identify landmarks that indicate prominent semitones in a particular chroma vector (pa 0033). Characterizing the chroma vectors created from extracted attributes of a song teaches determining a characterization of acoustic content and emotive content because the characterization is based on the semitones and structure of the song. The particular emotion that the song evokes in a listener is subjective. The particular structure of the song is reflected through the analysis of the intensity of different semitones throughout the song. Additionally, the claims require determining “a characterization of acoustic content and emotive content.” This recites a single characterization based on both acoustic content and emotive content. The chroma vector provides an indication of both the semitones and the intensity of the semitones for the particular song. Therefore, the characterization of the chroma vector teaches “determining a characterization of acoustic content and emotive content of audio of the input media based on a plurality of musical attributes extracted from the audio of the input media.”
With respect to claims 1 and 20, Applicant’s arguments are directed to a newly amended limitation. Applicant’s amendment has rendered the previous rejection moot. Upon further consideration of the amendment, a new ground of rejection is made in view of Venti et al. (US 12,347,409).
35 U.S.C. 101
Applicant argues that “searching a set of media based on the characterization of acoustic content and emotive content of the input media to identify media having a matching acoustic content and a matching emotive content” cannot practically be performed in the mind. The Examiner respectfully disagrees. A user can mentally search a set of media by analyzing known information about that media. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
Applicant argues that the specification describes an improvement to technology or technical field and thus is patent eligible. The Examiner respectfully disagrees. It is not clear how these discussions in the specification are reflected in the claim language. Considering the additional elements individually and in combination and the claim as a whole, the additional elements do not provide significantly more than the abstract idea. The claim is not patent eligible.
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to BRITTANY N ALLEN whose telephone number is (571) 270-3566. The examiner can normally be reached M-F, 9:00 am - 5:00 pm EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Sherief Badawi can be reached at 571-272-9782. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/BRITTANY N ALLEN/ Primary Examiner, Art Unit 2169