DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Priority
Receipt is acknowledged of certified copies of papers required by 37 CFR 1.55.
Response to Arguments
Applicant’s arguments, see pages 11-14, filed 1/22/2026, with respect to claims 21-40, have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 21 and 39-40 are rejected under 35 U.S.C. 103 as unpatentable over Sheppard (US 20190164528 A1, May 30, 2019), hereinafter Sheppard, in view of Roblek et al. (US 20180190249 A1, July 5, 2018), hereinafter Roblek, and further in view of Fiebrink et al. ("A Meta-Instrument for Interactive, On-the-fly Machine Learning," June 6, 2009, retrieved February 6, 2026), hereinafter Fiebrink.
Regarding claim 21, Sheppard teaches an information processing apparatus comprising: memory configured to store a plurality of pieces of music feature information in which a plurality of types of feature amounts extracted from music information is associated with predetermined identification information (Sheppard ¶0118: "The remote data store may comprise music datasets, such as those shown in FIG. 2b. In FIG. 2b, a data store or music dataset 40 comprises a harmony dataset 40 a, a beats dataset 40 b, a solo dataset 40 c and an atmosphere dataset 40 d. Each of these datasets 40 a to 40 d comprise a plurality of pre-recorded items of music (or modules)."); a communication interface configured to receive instruction information (Sheppard ¶0131: "In embodiments, the music generation process may have more than one operation mode. For example, the music generation process may have an 'active' operation mode and a 'passive' operation mode… In the 'passive' operation mode, the user data may comprise user selections or user-inputted data… in which the generated music may be generated in response to user selections") transmitted from a terminal apparatus (Sheppard ¶0172: "FIG. 9 shows a schematic diagram of a graphical user interface of user device 10 used for generating music. User device 10 may comprise a display screen 50, which may, in embodiments, be a touch screen. The display screen 50 may be used to display/present a graphical user interface (GUI) when the user launches the ‘app’ to generate/compose music." ¶0174: "In a passive operation mode, the GUI shown in FIG. 9 may enable a user to make user selections which are used (in addition to, or instead of, user data) to generate/compose music. The GUI may comprise further buttons which the user may use to make selections regarding the music to be generated.") that specifies selection of music feature information (Sheppard ¶0148: "FIG. 5 shows a flowchart of example steps to select pre-recorded items of music to generate/compose music. At step S90, the method to select items of music for use in the generated music comprises receiving user data. In the illustrated embodiment, the music dataset comprises a harmony dataset, a beats dataset, a solo dataset and an atmosphere dataset, which are described above with respect to FIG. 2b."); and circuitry configured to extract the music feature information from the memory according to the instruction information (Sheppard ¶0149: "using the received user data to filter all of the music datasets 40 a to 40 d (step S92)."); output presentation information of the extracted music feature information (Sheppard ¶0174: "In a passive operation mode, the GUI shown in FIG. 9 may enable a user to make user selections which are used (in addition to, or instead of, user data) to generate/compose music. The GUI may comprise further buttons which the user may use to make selections regarding the music to be generated.").
Sheppard does not explicitly disclose the music feature information being used as learning data in composition processing using machine learning, the extracted music feature information indicating characteristics of the learning data to be used in the composition processing; output presentation information of the extracted music feature information for display as a list of candidates for the learning data at the terminal apparatus; and receive, from the terminal apparatus via the communication interface, selection information specifying at least one piece of the extracted music feature information selected from the displayed list to be used as the learning data prior to initiating the composition processing.
However, Roblek suggests the music feature information being used as learning data in composition processing using machine learning, the extracted music feature information indicating characteristics of the learning data to be used in the composition processing (Roblek ¶0071: "In particular, the model trainer 160 can train a music generation model 120 or 140 or portions thereof based on a set of training data 162. In some implementations, the music generation models 120 or 140 can include deep neural network that provides a musical embedding for an input text (e.g., for each of one or more portions of the input text). In some of such implementations, deep neural network can be trained on or using a training dataset 162 that includes a plurality of sets of lyrics from humanly-generated songs, wherein each set of lyrics is annotated with one or more music features descriptive of the backing music associated with such set of lyrics in the corresponding humanly-generated song. Example music features include a tempo feature, a loudness feature, a dynamic feature, a pitch histogram feature, a dominant melodic interval feature, a direction of motion feature, and a dominant harmony feature.").
Furthermore, Fiebrink suggests output presentation information (Fiebrink § 4.1.4: "Wekinator’s 4-pane GUI is the single point of interaction with the learning system.") of the extracted music feature information (Fiebrink § 4.1.1: "The Wekinator’s built-in feature extractors for hardware controllers and audio are implemented in ChucK.") for display as a list of candidates for the learning data at the terminal apparatus (Fiebrink § 4.1.4: "Wekinator’s 4-pane GUI is the single point of interaction with the learning system. The first pane enables the setup and monitoring of OSC communication with the ChucK component of the Wekinator. The second allows the user to specify the features to use and their parameters (e.g., FFT size)."); and receive, from the terminal apparatus via the communication interface, selection information (Fiebrink § 4.1.2: "The GUI pane in Figure 2 allows the user to specify values for all output parameters simultaneously.") specifying at least one piece of the extracted music feature information selected from the displayed list (Fiebrink § 4.1.4: "Wekinator’s 4-pane GUI is the single point of interaction with the learning system. The first pane enables the setup and monitoring of OSC communication with the ChucK component of the Wekinator. The second allows the user to specify the features to use and their parameters (e.g., FFT size), create and save configurations for any custom HID devices, and optionally save and reload feature setting configurations. The third pane allows the user to specify the creation of a new model, or to reload a saved, possibly pre-trained model from a file. The fourth pane (shown in Figure 2 for a NN) allows real-time control over training set creation, training, adjusting model parameters, and running.") to be used as the learning data (Fiebrink § 3.1: "The Wekinator enables users to rapidly and interactively control ML algorithms by choosing inputs and their features, selecting a learning algorithm and its parameters, creating training example feature/parameter pairs, training the learner, and subjectively and objectively evaluating its performance, all in real-time and in a possibly non-linear sequence of actions.") prior to initiating the composition processing (Fiebrink § 3.1: "To run a trained model, the same features as were extracted to construct the training dataset are again extracted in real-time from the input sources, but now the model computes outputs from the features. These outputs can be used to drive synthesis or compositional parameters of musical code running in real-time.").
It would have been prima facie obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to have modified the information processing apparatus of Sheppard by adding the machine learning composition model of Roblek and the user-selected music feature information of Fiebrink to personalize the model (Roblek ¶0074) and open the door to new composition paradigms (Fiebrink § 3.2).
Regarding claim 39, Sheppard teaches an information processing method comprising: receiving instruction information (Sheppard ¶0131: "In embodiments, the music generation process may have more than one operation mode. For example, the music generation process may have an 'active' operation mode and a 'passive' operation mode… In the 'passive' operation mode, the user data may comprise user selections or user-inputted data… in which the generated music may be generated in response to user selections") transmitted from a terminal apparatus (Sheppard ¶0172: "FIG. 9 shows a schematic diagram of a graphical user interface of user device 10 used for generating music. User device 10 may comprise a display screen 50, which may, in embodiments, be a touch screen. The display screen 50 may be used to display/present a graphical user interface (GUI) when the user launches the ‘app’ to generate/compose music." ¶0174: "In a passive operation mode, the GUI shown in FIG. 9 may enable a user to make user selections which are used (in addition to, or instead of, user data) to generate/compose music. The GUI may comprise further buttons which the user may use to make selections regarding the music to be generated.") that specifies selection of music feature information (Sheppard ¶0148: "FIG. 5 shows a flowchart of example steps to select pre-recorded items of music to generate/compose music. At step S90, the method to select items of music for use in the generated music comprises receiving user data. In the illustrated embodiment, the music dataset comprises a harmony dataset, a beats dataset, a solo dataset and an atmosphere dataset, which are described above with respect to FIG. 2b."); extracting music feature information according to the instruction information (Sheppard ¶0149: "using the received user data to filter all of the music datasets 40 a to 40 d (step S92).") from a plurality of pieces of the music feature information in which a plurality of types of feature amounts extracted from music information is associated with predetermined identification information (Sheppard ¶0118: "The remote data store may comprise music datasets, such as those shown in FIG. 2b. In FIG. 2b, a data store or music dataset 40 comprises a harmony dataset 40 a, a beats dataset 40 b, a solo dataset 40 c and an atmosphere dataset 40 d. Each of these datasets 40 a to 40 d comprise a plurality of pre-recorded items of music (or modules).").
Sheppard does not explicitly disclose the extracted music feature information indicating characteristics of learning data to be used in composition processing using machine learning; outputting presentation information of the extracted music feature information for display as a list of candidates for the learning data at the terminal apparatus; and receiving selection information specifying at least one piece of the extracted music feature information selected from the displayed list to be used as the learning data prior to initiating the composition processing.
However, Roblek suggests: the extracted music feature information indicating characteristics of learning data to be used in composition processing using machine learning (Roblek ¶0071: "In particular, the model trainer 160 can train a music generation model 120 or 140 or portions thereof based on a set of training data 162. In some implementations, the music generation models 120 or 140 can include deep neural network that provides a musical embedding for an input text (e.g., for each of one or more portions of the input text). In some of such implementations, deep neural network can be trained on or using a training dataset 162 that includes a plurality of sets of lyrics from humanly-generated songs, wherein each set of lyrics is annotated with one or more music features descriptive of the backing music associated with such set of lyrics in the corresponding humanly-generated song. Example music features include a tempo feature, a loudness feature, a dynamic feature, a pitch histogram feature, a dominant melodic interval feature, a direction of motion feature, and a dominant harmony feature.").
Furthermore, Fiebrink suggests outputting presentation information (Fiebrink § 4.1.4: "Wekinator’s 4-pane GUI is the single point of interaction with the learning system.") of the extracted music feature information (Fiebrink § 4.1.1: "The Wekinator’s built-in feature extractors for hardware controllers and audio are implemented in ChucK.") for display as a list of candidates for the learning data at the terminal apparatus (Fiebrink § 4.1.4: "Wekinator’s 4-pane GUI is the single point of interaction with the learning system. The first pane enables the setup and monitoring of OSC communication with the ChucK component of the Wekinator. The second allows the user to specify the features to use and their parameters (e.g., FFT size)."); and receiving selection information (Fiebrink § 4.1.2: "The GUI pane in Figure 2 allows the user to specify values for all output parameters simultaneously.") specifying at least one piece of the extracted music feature information selected from the displayed list (Fiebrink § 4.1.4: "Wekinator’s 4-pane GUI is the single point of interaction with the learning system. The first pane enables the setup and monitoring of OSC communication with the ChucK component of the Wekinator. The second allows the user to specify the features to use and their parameters (e.g., FFT size), create and save configurations for any custom HID devices, and optionally save and reload feature setting configurations. The third pane allows the user to specify the creation of a new model, or to reload a saved, possibly pre-trained model from a file. The fourth pane (shown in Figure 2 for a NN) allows real-time control over training set creation, training, adjusting model parameters, and running.") to be used as the learning data (Fiebrink § 3.1: "The Wekinator enables users to rapidly and interactively control ML algorithms by choosing inputs and their features, selecting a learning algorithm and its parameters, creating training example feature/parameter pairs, training the learner, and subjectively and objectively evaluating its performance, all in real-time and in a possibly non-linear sequence of actions.") prior to initiating the composition processing (Fiebrink § 3.1: "To run a trained model, the same features as were extracted to construct the training dataset are again extracted in real-time from the input sources, but now the model computes outputs from the features. These outputs can be used to drive synthesis or compositional parameters of musical code running in real-time.").
It would have been prima facie obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to have modified the information processing method of Sheppard by adding the machine learning composition model of Roblek and the user-selected music feature information of Fiebrink to personalize the model (Roblek ¶0074) and open the door to new composition paradigms (Fiebrink § 3.2).
Regarding claim 40, Sheppard teaches a non-transitory computer-readable medium including computer-program instructions (Sheppard ¶0056: "Embodiments of the present techniques also provide a non-transitory data carrier carrying code which, when implemented on a processor, causes the processor to carry out the methods described herein."), which, when executed by circuitry, cause the circuitry to: receive instruction information (Sheppard ¶0131: "In embodiments, the music generation process may have more than one operation mode. For example, the music generation process may have an 'active' operation mode and a 'passive' operation mode… In the 'passive' operation mode, the user data may comprise user selections or user-inputted data… in which the generated music may be generated in response to user selections") transmitted from a terminal apparatus (Sheppard ¶0172: "FIG. 9 shows a schematic diagram of a graphical user interface of user device 10 used for generating music. User device 10 may comprise a display screen 50, which may, in embodiments, be a touch screen. The display screen 50 may be used to display/present a graphical user interface (GUI) when the user launches the ‘app’ to generate/compose music." ¶0174: "In a passive operation mode, the GUI shown in FIG. 9 may enable a user to make user selections which are used (in addition to, or instead of, user data) to generate/compose music. The GUI may comprise further buttons which the user may use to make selections regarding the music to be generated.") that specifies selection of music feature information (Sheppard ¶0148: "FIG. 5 shows a flowchart of example steps to select pre-recorded items of music to generate/compose music. At step S90, the method to select items of music for use in the generated music comprises receiving user data. In the illustrated embodiment, the music dataset comprises a harmony dataset, a beats dataset, a solo dataset and an atmosphere dataset, which are described above with respect to FIG. 2b."); extract music feature information according to the instruction information (Sheppard ¶0149: "using the received user data to filter all of the music datasets 40 a to 40 d (step S92).") from a plurality of pieces of the music feature information in which a plurality of types of feature amounts extracted from music information is associated with predetermined identification information (Sheppard ¶0118: "The remote data store may comprise music datasets, such as those shown in FIG. 2b. In FIG. 2b, a data store or music dataset 40 comprises a harmony dataset 40 a, a beats dataset 40 b, a solo dataset 40 c and an atmosphere dataset 40 d. Each of these datasets 40 a to 40 d comprise a plurality of pre-recorded items of music (or modules).").
Sheppard does not explicitly disclose the extracted music feature information indicating characteristics of learning data to be used in composition processing using machine learning; output presentation information of the extracted music feature information for display as a list of candidates for the learning data at the terminal apparatus; and receive selection information specifying at least one piece of the extracted music feature information selected from the displayed list to be used as the learning data prior to initiating the composition processing.
However, Roblek suggests the extracted music feature information indicating characteristics of learning data to be used in composition processing using machine learning (Roblek ¶0071: "In particular, the model trainer 160 can train a music generation model 120 or 140 or portions thereof based on a set of training data 162. In some implementations, the music generation models 120 or 140 can include deep neural network that provides a musical embedding for an input text (e.g., for each of one or more portions of the input text). In some of such implementations, deep neural network can be trained on or using a training dataset 162 that includes a plurality of sets of lyrics from humanly-generated songs, wherein each set of lyrics is annotated with one or more music features descriptive of the backing music associated with such set of lyrics in the corresponding humanly-generated song. Example music features include a tempo feature, a loudness feature, a dynamic feature, a pitch histogram feature, a dominant melodic interval feature, a direction of motion feature, and a dominant harmony feature.").
Furthermore, Fiebrink suggests output presentation information (Fiebrink § 4.1.4: "Wekinator’s 4-pane GUI is the single point of interaction with the learning system.") of the extracted music feature information (Fiebrink § 4.1.1: "The Wekinator’s built-in feature extractors for hardware controllers and audio are implemented in ChucK.") for display as a list of candidates for the learning data at the terminal apparatus (Fiebrink § 4.1.4: "Wekinator’s 4-pane GUI is the single point of interaction with the learning system. The first pane enables the setup and monitoring of OSC communication with the ChucK component of the Wekinator. The second allows the user to specify the features to use and their parameters (e.g., FFT size)."); and receive selection information (Fiebrink § 4.1.2: "The GUI pane in Figure 2 allows the user to specify values for all output parameters simultaneously.") specifying at least one piece of the extracted music feature information selected from the displayed list (Fiebrink § 4.1.4: "Wekinator’s 4-pane GUI is the single point of interaction with the learning system. The first pane enables the setup and monitoring of OSC communication with the ChucK component of the Wekinator. The second allows the user to specify the features to use and their parameters (e.g., FFT size), create and save configurations for any custom HID devices, and optionally save and reload feature setting configurations. The third pane allows the user to specify the creation of a new model, or to reload a saved, possibly pre-trained model from a file. The fourth pane (shown in Figure 2 for a NN) allows real-time control over training set creation, training, adjusting model parameters, and running.") to be used as the learning data (Fiebrink § 3.1: "The Wekinator enables users to rapidly and interactively control ML algorithms by choosing inputs and their features, selecting a learning algorithm and its parameters, creating training example feature/parameter pairs, training the learner, and subjectively and objectively evaluating its performance, all in real-time and in a possibly non-linear sequence of actions.") prior to initiating the composition processing (Fiebrink § 3.1: "To run a trained model, the same features as were extracted to construct the training dataset are again extracted in real-time from the input sources, but now the model computes outputs from the features. These outputs can be used to drive synthesis or compositional parameters of musical code running in real-time.").
It would have been prima facie obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to have modified the non-transitory computer-readable medium of Sheppard by adding the machine learning composition model of Roblek and the user-selected music feature information of Fiebrink to personalize the model (Roblek ¶0074) and open the door to new composition paradigms (Fiebrink § 3.2).
Claims 22-38 are rejected under 35 U.S.C. 103 as unpatentable over Sheppard in view of Roblek and Fiebrink, and further in view of Tomokazu et al. (JP 2016161774 A, September 5, 2016), hereinafter Tomokazu.
Regarding claim 22, Sheppard (in view of Roblek and Fiebrink) teaches an information processing apparatus comprising the features of claim 21 as discussed above.
Sheppard further teaches that the instruction information includes information regarding the feature amounts.
Sheppard (in view of Roblek and Fiebrink) does not explicitly disclose that the circuitry is configured to rank a plurality of pieces of the music feature information by using a predetermined rule on a basis of the information regarding the feature amounts, extract the music feature information arranged in a preset rank, and output the presentation information of the extracted music feature information to an external apparatus together with ranking information indicating ranking of the music feature information.
However, Tomokazu suggests that the circuitry is configured to rank a plurality of pieces of the music feature information by using a predetermined rule on a basis of the information regarding the feature amounts (Tomokazu ¶0008: "In the above aspect, the order in which multiple songs are presented is changed, which has the advantage that multiple songs can be presented to the user with priority from various perspectives. For example, the music presentation means presents a plurality of songs in an order according to at least one of the history of music data generation by the music generation means, location information relating to the user, and the time at which the plurality of songs are presented."), extract the music feature information arranged in a preset rank (Tomokazu ¶0023: "When a user U specifies only lyrics L without specifying musical attribute, an information management unit 62 selects a predetermined number of genres (genres that have been frequently used in the past) from among a plurality of genres in descending order of frequency indicated by a generation history HB, and generates attribute information zB indicating one genre (hereinafter referred to as a 'selected genre') randomly selected from the predetermined number of genres."), and output the presentation information of the extracted music feature information to an external apparatus together with ranking information indicating ranking of the music feature information (Tomokazu ¶0015: "Specifically, the display device 24 displays a screen shown in FIG. 3 (hereinafter referred to as the 'music presentation screen'), which presents to the user U the N pieces of music for which the music creation system 10 has generated the music data D. The user U can select a desired piece of music from the N pieces of music presented on the music presentation screen by appropriately operating the operation device 23.").
It would have been prima facie obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to have modified the information processing apparatus of Sheppard (as modified by Roblek and Fiebrink) by adding the ranking and arranging according to a predetermined rule taught by Tomokazu to increase the possibility of providing the user with a piece of music that matches the user's intentions and preferences (Tomokazu ¶0005).
Regarding claim 23, Sheppard (in view of Roblek, Fiebrink, and Tomokazu) teaches an information processing apparatus comprising the features of claim 22 as discussed above.
Sheppard further suggests that the instruction information is operation information in the terminal apparatus (Sheppard ¶0100: "The user device 10 may comprise interfaces 18, such as a conventional computer screen/display screen, keyboard, mouse, and/or other interfaces such as a network interface and software interfaces… The user interface may be configured to receive user feedback on a piece of generated music, and/or to receive user requests to modify (or save, or buy, etc.) a piece of generated music.").
Regarding claim 24, Sheppard (in view of Roblek, Fiebrink, and Tomokazu) teaches an information processing apparatus comprising the features of claim 23 as discussed above.
Sheppard further suggests that the music feature information includes score information including chord progression information indicating a chord progression (Sheppard ¶0086: "The term 'harmony' or 'harmony layer' is used herein to mean an arrangement of simultaneous musical pitches, tones, notes or chords, which may be played or sung at the same time (or one after the other), and which often produce a pleasing sound."), melody information indicating a melody (Sheppard ¶0088: "The term 'solo' or 'solo layer' is used herein to mean a piece of music (or a section of music) that is played or sung by a single performer, and/or to mean a melody (i.e. a sequence/arrangement of single notes that form a distinct musical phrase or idea)."), and bass information indicating a bass progression (Sheppard ¶0086: "The term 'harmony' or 'harmony layer' is used herein to mean an arrangement of simultaneous musical pitches, tones, notes or chords, which may be played or sung at the same time (or one after the other), and which often produce a pleasing sound." Sheppard's "notes" encompass all notes, including bass notes, i.e., a bass progression.) in a bar having a prescribed length (Sheppard ¶0209: "Cells may be different lengths. For example, one harmony cell may be two measures long, while another harmony cell may be three measures long.").
Regarding claim 25, Sheppard (in view of Roblek, Fiebrink, and Tomokazu) teaches an information processing apparatus comprising the features of claim 24 as discussed above.
Sheppard further suggests that the score information further includes drum progression information indicating a drum progression in the bar having the prescribed length (Sheppard ¶0087: "The term 'beat' or 'beats' or 'beats layer' is used herein to mean a rhythmic movement or the speed at which a piece of music is played, which typically form the pulses or extra rhythmic elements of a piece of music. The beat may be provided by the playing of a percussion instrument. For example, a beat may be the rhythmic noise played on a drum.").
Regarding claim 26, Sheppard (in view of Roblek, Fiebrink, and Tomokazu) teaches an information processing apparatus comprising the features of claim 24 as discussed above.
Tomokazu further suggests that the music feature information includes lyric information indicating lyrics (Tomokazu ¶0059: "As illustrated in FIG. 7, lyrics L specified by a user U are divided into a plurality of parts (character strings) λ for each unit interval, and in area B1 of unit area B corresponding to any one unit interval, the part λ of the lyrics L corresponding to that unit interval is displayed.") in the bar having the prescribed length (Tomokazu ¶0059: "The unit section is, for example, a section having a time length equivalent to a predetermined number of bars.").
It would have been prima facie obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to have modified the information processing apparatus of Sheppard (as modified by Roblek and Fiebrink) by adding the lyric information of Tomokazu to increase the possibility of providing the user with a piece of music that matches the user's intentions and preferences (Tomokazu ¶0005).
Regarding claim 27, Sheppard (in view of Roblek, Fiebrink, and Tomokazu) teaches an information processing apparatus comprising the features of claim 26 as discussed above.
Tomokazu further suggests that the music feature information includes music format information in which identification information of the score information and identification information of the lyric information for a same bar are registered in association with each other, and music order information indicating an order of the music format information (Tomokazu ¶0063: "It is also possible to determine the presentation order of the N pieces of music in each unit section according to the content of the portion λ of the lyrics L of each unit section.").
Regarding claim 28, Sheppard (in view of Roblek, Fiebrink, and Tomokazu) teaches an information processing apparatus comprising the features of claim 24 as discussed above.
Tomokazu further suggests that the communication interface is configured to receive instruction information for selecting any one piece of the score information (Tomokazu ¶0056: "If user U selects presentation of N songs in an order according to the presentation time"), and the circuitry is configured to rank the music feature information including the score information selected by the instruction information by using a predetermined rule; and extract the music feature information arranged in a preset rank (Tomokazu ¶0056: "the presentation order of the N songs is determined so that when the presentation time falls within the morning time slot, songs with a 'cheerful' melody are ranked higher, when the presentation time falls within the daytime afternoon time slot, songs with an 'intense' melody are ranked higher, and when the presentation time falls within the nighttime time slot, songs with a 'calm' melody are ranked higher.").
Regarding claim 29, Sheppard (in view of Roblek, Fiebrink, and Tomokazu) teaches an information processing apparatus comprising the features of claim 26 as discussed above.
Tomokazu further suggests that the communication interface is configured to receive instruction information for instructing to search lyrics (Tomokazu ¶0016: "Specifically, the user U can specify the lyrics L of the song and a number of song attributes. The lyrics L are a character string expressing the pronunciation of the singing part of the song, and are arbitrarily designated by the user U." This inherently involves searching for a match to the string input by the user.), and the circuitry is configured to rank the music feature information including lyric information including the lyrics for which instruction of searching is given by the instruction information by using a predetermined rule; and extract the music feature information arranged in a preset rank (Tomokazu ¶0063: "For example, as illustrated in FIG. 8, for a unit section where the meaning or emotion of the part λ of the lyrics L is 'It's sunny,' songs with a 'cheerful' melody that correlates with the meaning or emotion of the part λ are presented higher than songs with a 'lonely' melody, and for a unit section in which the part λ of the lyrics L is 'It's raining,' music pieces with a 'lonely' melody tone that correlates with the meaning or emotion of the portion λ are presented higher than music pieces with a 'cheerful' melody tone. The above configuration has the advantage that the user U can easily select a piece of music having a melody that harmonizes with each portion λ of the lyrics L.").
Regarding claim 30, Sheppard (in view of Roblek, Fiebrink, and Tomokazu) teaches an information processing apparatus comprising the features of claim 23 as discussed above.
Tomokazu further suggests that the terminal apparatus is a producer terminal apparatus in which an application related to creation of music is installed (Tomokazu ¶0012: "The terminal device 12 is, for example, a portable communication terminal such as a mobile phone or a smartphone, or a portable or fixed communication terminal such as a personal computer."), the instruction information is operation history information indicating, in response to activation of the application, a history of an operation on the producer terminal apparatus by a producer of music (Tomokazu ¶0023: "The storage device 34 also stores, for each of a plurality of genres of music that can be generated by the automatic composition process, a history HB of music of that genre that has been generated in the past (hereinafter referred to as the 'generation history')."), and the circuitry is configured to obtain music information for which a number of executions of predetermined operation is determined to exceed a threshold value on a basis of the operation history information (Tomokazu ¶0069: "generation history HB"; any count indicated by the generation history HB inherently exceeds a threshold value of zero.); rank music feature information used for the obtained music information in descending order of the number of executions of predetermined operation (Tomokazu ¶0069: "the genre that ranks highest in descending order of the number of times indicated by the generation history HB is selected from among multiple genres"); extract the music feature information arranged in a preset rank (Tomokazu ¶0069: "the genre that ranks highest in descending order of the number of times indicated by the generation history HB is selected from among multiple genres"); and output the presentation information of the extracted music feature information to the producer terminal apparatus (Tomokazu ¶0018: "the management device 30 manages the automatic composition process by the processing device 40 and the presentation of the N pieces of music to the terminal device 12.").
Regarding claim 31, Sheppard (in view of Roblek, Fiebrink, and Tomokazu) teaches an information processing apparatus comprising the features of claim 23 as discussed above.
Sheppard further discloses that the terminal apparatus is a user terminal apparatus in which an application for listening to music is installed (Sheppard ¶0097: "FIG. 1a shows a schematic view of an example user device 10 for generating music… The user device comprises at least one processor 12, which… may be configured to output generated music/a cellular composition to a user"; Sheppard ¶0096: "The instructions may be provided as part of a “software app” which the user may be able to download, install and run on the user device.").
Tomokazu further suggests that the instruction information is operation history information indicating, in response to activation of the application, a history of an operation on the user terminal apparatus by a user who listens to the music (Tomokazu ¶0053: "a user U selects presentation of N songs in an order according to the music composition history"), and the circuitry is configured to analyze the operation history information to obtain a number of executions of each operation (Tomokazu ¶0053: "the music presentation unit 64 determines the order in which to present the N songs according to the history of the music composition unit 66's past generation of music data D").
Regarding claim 32, Sheppard (in view of Roblek, Fiebrink, and Tomokazu) teaches an information processing apparatus comprising the features of claim 31 as discussed above.
Tomokazu further suggests that the circuitry is configured to: obtain music information for which a number of executions of predetermined operation is determined to exceed a threshold value on a basis of results of analysis (Tomokazu ¶0069: "generation history HB"; any count indicated by the generation history HB inherently exceeds a threshold value of zero.); rank music feature information used for the obtained music information in descending order of the number of executions of predetermined operation (Tomokazu ¶0069: "the genre that ranks highest in descending order of the number of times indicated by the generation history HB is selected from among multiple genres"); extract the music feature information arranged in a preset rank (Tomokazu ¶0069: "the genre that ranks highest in descending order of the number of times indicated by the generation history HB is selected from among multiple genres"); and output the presentation information of the extracted music feature information to a producer terminal apparatus in which an application related to creation of music is installed (Tomokazu ¶0018: "the management device 30 manages the automatic composition process by the processing device 40 and the presentation of the N pieces of music to the terminal device 12.").
Regarding claim 33, Sheppard (in view of Roblek, Fiebrink, and Tomokazu) teaches an information processing apparatus comprising the features of claim 31 as discussed above.
Tomokazu further suggests that the circuitry is configured to obtain music information for which a number of executions of predetermined operation is determined to exceed a threshold value (Tomokazu ¶0069: "generation history HB"; any count indicated by the generation history HB inherently exceeds a threshold value of zero.); generate a playlist on a basis of the obtained music information (Tomokazu ¶0007: "a plurality of pieces of music including accompaniment sounds in a genre specified by the user are generated, so that it is possible to provide pieces of music with accompaniment sounds that match the user's intentions and preferences."); and output the playlist to the user terminal apparatus (Tomokazu ¶0018: "the management device 30 manages the automatic composition process by the processing device 40 and the presentation of the N pieces of music to the terminal device 12.").
Regarding claim 34, Sheppard (in view of Roblek, Fiebrink, and Tomokazu) teaches an information processing apparatus comprising the features of claim 31 as discussed above.
Roblek further suggests that the circuitry is configured to: compose music information using machine learning on a basis of the music feature information (Roblek ¶0071: "In particular, the model trainer 160 can train a music generation model 120 or 140 or portions thereof based on a set of training data 162. In some implementations, the music generation models 120 or 140 can include deep neural network that provides a musical embedding for an input text (e.g., for each of one or more portions of the input text). In some of such implementations, deep neural network can be trained on or using a training dataset 162 that includes a plurality of sets of lyrics from humanly-generated songs, wherein each set of lyrics is annotated with one or more music features descriptive of the backing music associated with such set of lyrics in the corresponding humanly-generated song. Example music features include a tempo feature, a loudness feature, a dynamic feature, a pitch histogram feature, a dominant melodic interval feature, a direction of motion feature, and a dominant harmony feature.").
Tomokazu further suggests: obtain the music information for which a number of executions of predetermined operation is determined to exceed a threshold value (Tomokazu ¶0069: "generation history HB"; any count indicated by the generation history HB inherently exceeds a threshold value of zero.); rank music feature information used for the obtained music information in descending order of the number of executions of predetermined operation (Tomokazu ¶0069: "the genre that ranks highest in descending order of the number of times indicated by the generation history HB is selected from among multiple genres"); extract the music feature information arranged in a preset rank (Tomokazu ¶0069: "the genre that ranks highest in descending order of the number of times indicated by the generation history HB is selected from among multiple genres"); recompose or arrange the music information on a basis of the extracted music feature information (Tomokazu ¶0018: "the management device 30 manages the automatic composition process by the processing device 40 and the presentation of the N pieces of music to the terminal device 12."); and output the recomposed or arranged music information to the user terminal apparatus (Tomokazu ¶0047: "The control device 32 of the terminal device 12 supplies the music data D received by the communication device 22 from the management device 30 to the sound output device 25, thereby playing back the selected music piece (SA13).").
Regarding claim 35, Sheppard (in view of Roblek, Fiebrink, and Tomokazu) teaches an information processing apparatus comprising the features of claim 23 as discussed above.
Sheppard further suggests that the terminal apparatus is a user terminal apparatus in which an application for listening to music is installed (Sheppard ¶0097: "FIG. 1a shows a schematic view of an example user device 10 for generating music… The user device comprises at least one processor 12, which… may be configured to output generated music/a cellular composition to a user"; Sheppard ¶0096: "The instructions may be provided as part of a “software app” which the user may be able to download, install and run on the user device."), the instruction information is action history information indicating a movement history of the user terminal apparatus (Sheppard ¶0026: "In embodiments, the processor is configured to: receive a change in user data relating to direction of travel; and modify, responsive to the change in user data, a key of the composed music."), and the circuitry is configured to obtain music information played by the user terminal apparatus (Sheppard ¶0023: "The user data may comprise one or more of: time of received request to compose music, date of received request to compose music, weather conditions when request to compose music is received"); and analyze the action history information to obtain a position of the user (Sheppard ¶0023: "The user data may comprise one or more of… pace, speed… location, GPS location, and direction of movement." Sheppard ¶0166: "In embodiments, the method may comprise determining if the user's location has changed. A user's location may change if the user has started moving.").
Regarding claim 36, Sheppard (in view of Roblek, Fiebrink, and Tomokazu) teaches an information processing apparatus comprising the features of claim 35 as discussed above.
Tomokazu further suggests that the circuitry is configured to: rank the music feature information used for music information played a number of times exceeding a threshold value (Tomokazu ¶0070: "generation history HA and generation history HB"; any play count indicated by the generation histories inherently exceeds a threshold value of zero.) at a predetermined place by using a predetermined rule on a basis of results of the analysis (Tomokazu ¶0070: "it is possible to set the generation history HA and generation history HB to, for example, an average history of the results of past selections by a large number of users across the country"; the country is the predetermined place.); extract the music feature information arranged in a preset rank (Tomokazu ¶0069: "the genre that ranks highest in descending order of the number of times indicated by the generation history HB is selected from among multiple genres"); and output the presentation information of the extracted music feature information to a producer terminal apparatus (Tomokazu ¶0018: "the management device 30 manages the automatic composition process by the processing device 40 and the presentation of the N pieces of music to the terminal device 12.") in which an application related to creation of music is installed (Tomokazu ¶0012: "The terminal device 12 is, for example, a portable communication terminal such as a mobile phone or a smartphone, or a portable or fixed communication terminal such as a personal computer.").
Regarding claim 37, Sheppard (in view of Roblek, Fiebrink, and Tomokazu) teaches an information processing apparatus comprising the features of claim 35 as discussed above.
Tomokazu further suggests that the circuitry is configured to obtain music information for which a number of executions of predetermined operation at a predetermined place (Tomokazu ¶0070: "it is possible to set the generation history HA and generation history HB to, for example, an average history of the results of past selections by a large number of users across the country"; the country is the predetermined place.) is determined to exceed a threshold value (any count indicated by the generation histories inherently exceeds a threshold value of zero.); generate a playlist on a basis of the obtained music information (Tomokazu ¶0072: "In the above-described embodiments, N songs are presented to the user U as candidates for playback of the song data D."); and output the playlist to the user terminal apparatus located at the predetermined place (Tomokazu ¶0018: "the management device 30 manages the automatic composition process by the processing device 40 and the presentation of the N pieces of music to the terminal device 12.").
Regarding claim 38, Sheppard (in view of Roblek, Fiebrink, and Tomokazu) teaches an information processing apparatus comprising the features of claim 35 as discussed above.
Roblek further suggests that the circuitry is configured to compose music information using machine learning on a basis of the music feature information (Roblek ¶0071: "In particular, the model trainer 160 can train a music generation model 120 or 140 or portions thereof based on a set of training data 162. In some implementations, the music generation models 120 or 140 can include deep neural network that provides a musical embedding for an input text (e.g., for each of one or more portions of the input text). In some of such implementations, deep neural network can be trained on or using a training dataset 162 that includes a plurality of sets of lyrics from humanly-generated songs, wherein each set of lyrics is annotated with one or more music features descriptive of the backing music associated with such set of lyrics in the corresponding humanly-generated song. Example music features include a tempo feature, a loudness feature, a dynamic feature, a pitch histogram feature, a dominant melodic interval feature, a direction of motion feature, and a dominant harmony feature.").
Tomokazu further suggests: rank the music feature information used for music information for which a number of executions of predetermined operation at a predetermined place (Tomokazu ¶0070: "it is possible to set the generation history HA and generation history HB to, for example, an average history of the results of past selections by a large number of users across the country"; the country is the predetermined place.) is determined to exceed a threshold value (any count indicated by the generation histories inherently exceeds a threshold value of zero.), the rank being in descending order of the number of executions of the predetermined operation (Tomokazu ¶0069: "the genre that ranks highest in descending order of the number of times indicated by the generation history HB is selected from among multiple genres"); extract the music feature information arranged in a preset rank (Tomokazu ¶0069: "the genre that ranks highest in descending order of the number of times indicated by the generation history HB is selected from among multiple genres"); recompose or arrange the music information on a basis of the extracted music feature information (Tomokazu ¶0018: "the management device 30 manages the automatic composition process by the processing device 40 and the presentation of the N pieces of music to the terminal device 12."); and output the recomposed or arranged music information to the user terminal apparatus located at the predetermined place (Tomokazu ¶0047: "The control device 32 of the terminal device 12 supplies the music data D received by the communication device 22 from the management device 30 to the sound output device 25, thereby playing back the selected music piece (SA13).").
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to PHILIP SCOLES whose telephone number is (703)756-1831. The examiner can normally be reached Monday-Friday 8:30-4:30 ET.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Dedei Hammond can be reached on 571-270-7938. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/PHILIP G SCOLES/
Examiner, Art Unit 2837
/JEFFREY DONELS/Primary Examiner, Art Unit 2837