Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
DETAILED ACTION
Claims 1-22 are pending in this application and are examined in this Office action.
This office action is NON-FINAL.
Drawings
The Drawings filed on 06/09/23 are acceptable for examination purposes.
Specification
The Specification filed on 06/09/23 is acceptable for examination purposes.
Information Disclosure Statement
The information disclosure statements (IDS) filed on 06/09/23, 06/06/24, 02/20/25, and 08/28/25 have been considered by the Examiner and made of record in the application file.
Examiner’s Note - 35 U.S.C. § 112
The following is a quotation of 35 U.S.C. § 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination
may be expressed as a means or step for performing a specified function without
the recital of structure, material, or acts in support thereof, and such claim shall
be construed to cover the corresponding structure, material, or acts described in
the specification and equivalents thereof.
Use of the word “means” (or “step for”) in a claim with functional language
creates a rebuttable presumption that the claim element is to be treated in accordance
with 35 U.S.C. § 112(f). The presumption that 35 U.S.C. § 112(f) is invoked is rebutted only when the function is recited with sufficient structure, material, or acts within the claim
itself to entirely perform the recited function.
Absence of the word “means” (or “step for”) in a claim creates a rebuttable
presumption that the claim element is not to be treated in accordance with 35 U.S.C. §
112(f). The presumption that 35 U.S.C. § 112(f) is not invoked is rebutted when the
claim element recites function but fails to recite sufficiently definite structure, material or
acts to perform that function.
Claim elements in this application that use the word “means” (or “step for”) are
presumed to invoke 35 U.S.C. § 112(f) except as otherwise indicated in an Office
action. Similarly, claim elements that do not use the word “means” (or “step for”) are
presumed not to invoke 35 U.S.C. § 112(f) except as otherwise indicated in an Office
action.
The claims 1-22 limitations “data input means for inputting,” “sequence data selecting means for,” “display means for displaying,” “a generation unit that generates a sequence,” and “the generation unit generates a sequence” have been interpreted under 35 U.S.C. § 112(f) because they use the generic placeholders “means for” and “unit,” coupled with functional language (“inputting,” “selecting,” “displaying,” “generates”), without reciting sufficient structure to achieve the claimed functions. Furthermore, the generic placeholders are not preceded by structural modifiers.
Since the claim limitations invoke 35 U.S.C. § 112(f), the specification was reviewed to find a description of the corresponding structure to achieve the claimed functions. The Examiner found that the specification does not explicitly disclose specific structure corresponding to the claimed functions.
If Applicant wishes to provide further explanation or dispute the Examiner’s interpretation of the corresponding structure, Applicant must identify the corresponding structure with reference to the specification by page and line number, and to the drawing, if any, by reference characters in response to this Office action.
If Applicant does not intend to have the claim limitations treated under 35 U.S.C. § 112(f), Applicant may amend the claims so that they will clearly not invoke 35 U.S.C. § 112(f), or present a sufficient showing that the claims recite sufficient structure, material, or acts for performing the claimed functions to preclude application of 35 U.S.C. § 112(f).
For more information, see MPEP § 2173 et seq. and Supplementary Examination
Guidelines for Determining Compliance With 35 U.S.C. § 112 and for Treatment of
Related Issues in Patent Applications, 76 FR 7162, 7167 (Feb. 9, 2011).
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine,
manufacture, or composition of matter, or any new and useful improvement
thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-22 are rejected under 35 U.S.C. 101 because the claimed invention is
directed to an abstract idea without significantly more.
Claim 1 recites, “data input means for inputting sequence data; a machine learning model that generates new sequence data based on the sequence data input by the data input means; and sequence data selecting means for, when the new sequence data is generated by the machine learning model, selecting target sequence data for changing the sequence data and/or context sequence data for not changing the sequence data, wherein the control means: (i) generates new target sequence data that interpolates at least two sequence data already generated by the machine learning model; or (ii) generates new different sequence data for the sequence data already generated by the machine learning model”.
The limitation of “data input means for inputting sequence data; a machine learning model that generates new sequence data based on the sequence data input by the data input means; and sequence data selecting means for, when the new sequence data is generated by the machine learning model, selecting target sequence data for changing the sequence data and/or context sequence data for not changing the sequence data, wherein the control means: (i) generates new target sequence data that interpolates at least two sequence data already generated by the machine learning model; or (ii) generates new different sequence data for the sequence data already generated by the machine learning model” is a mental process: nothing in the claim element precludes the step from practically being performed in the mind. Accordingly, the claim recites an abstract idea. This judicial exception is not integrated into a practical application. The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. The claim is not patent eligible.
Claim 2 is dependent on claim 1 and includes all the limitations of claim 1. Claim
2 recites displaying, in a designation enabled form, a position in a space that defines a feature value of the sequence data learned by the machine learning model, wherein the control means generates, as the new sequence data, sequence data having a feature value corresponding to a designated position in the space in claim 2. But displaying, in a designation enabled form, a position in a space that defines a feature value of the sequence data learned by the machine learning model, wherein the control means generates, as the new sequence data, sequence data having a feature value corresponding to a designated position in the space does not go beyond the abstract idea itself. There are no additional components in the claim that would make it significantly more than the abstract idea.
Claim 3 recites, “generates a sequence including a determined context sequence and a new target sequence using input information and a learned model, the input information being information concerning a sequence in which a part is configured by a target sequence and a remainder is configured by a context sequence and that provides a series of information, wherein when data corresponding to the input information is input, the learned model outputs data corresponding to the new target sequence”.
The limitation of “generates a sequence including a determined context sequence and a new target sequence using input information and a learned model, the input information being information concerning a sequence in which a part is configured by a target sequence and a remainder is configured by a context sequence and that provides a series of information, wherein when data corresponding to the input information is input, the learned model outputs data corresponding to the new target sequence” is a mental process: nothing in the claim element precludes the step from practically being performed in the mind. Accordingly, the claim recites an abstract idea. This judicial exception is not integrated into a practical application. The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. The claim is not patent eligible.
Claim 4 is dependent on claim 3 and includes all the limitations of claim 3. Claim
4 recites wherein the input information includes: the determined context sequence; and position information of the determined context sequence in the sequence in claim 4. But the determined context sequence; and position information of the determined context sequence in the sequence does not go beyond the abstract idea itself. There are no additional components in the claim that would make it significantly more than the abstract idea.
Claim 5 is dependent on claim 3 and includes all the limitations of claim 3. Claim
5 recites wherein the input information includes information concerning the sequence generated by the generation unit, and the generation unit generates a sequence including, as the new target sequence, a target sequence different from the target sequence of the sequence generated by the generation unit in claim 5. But the input information includes information concerning the sequence generated by the generation unit, and the generation unit generates a sequence including, as the new target sequence, a target sequence different from the target sequence of the sequence generated by the generation unit does not go beyond the abstract idea itself. There are no additional components in the claim that would make it significantly more than the abstract idea.
Claim 6 is dependent on claim 3 and includes all the limitations of claim 3. Claim
6 recites wherein the input information includes information concerning the sequence generated by the generation unit, and the generation unit generates a sequence including, as the new target sequence, a target sequence different from the target sequence of the sequence generated by the generation unit in claim 6. But wherein the input information includes information concerning the sequence generated by the generation unit, and the generation unit generates a sequence including, as the new target sequence, a target sequence different from the target sequence of the sequence generated by the generation unit does not go beyond the abstract idea itself. There are no additional components in the claim that would make it significantly more than the abstract idea.
Claim 7 is dependent on claim 3 and includes all the limitations of claim 3. Claim
7 recites wherein the input information includes information for designating two sequences among a plurality of sequences generated by the generation unit, and the generation unit generates a sequence including, as the new target sequence, a target sequence having a feature between features of target sequences of the designated two sequences in claim 7. But information for designating two sequences among a plurality of sequences generated by the generation unit, and the generation unit generates a sequence including, as the new target sequence, a target sequence having a feature between features of target sequences of the designated two sequence does not go beyond the abstract idea itself. There are no additional components in the claim that would make it significantly more than the abstract idea.
Claim 8 is dependent on claim 3 and includes all the limitations of claim 3. Claim
8 recites wherein the input information includes information for designating a feature of a sequence, and the generation unit generates a sequence having the designated feature in claim 8. But the input information includes information for designating a feature of a sequence, and the generation unit generates a sequence having the designated feature does not go beyond the abstract idea itself. There are no additional components in the claim that would make it significantly more than the abstract idea.
Claim 9 is dependent on claim 3 and includes all the limitations of claim 3. Claim
9 recites wherein the data input to the learned model includes a token of the determined context sequence, and the data output by the learned model includes a token of the new target sequence in claim 9. But the data input to the learned model includes a token of the determined context sequence, and the data output by the learned model includes a token of the new target sequence does not go beyond the abstract idea itself. There are no additional components in the claim that would make it significantly more than the abstract idea.
Claim 10 is dependent on claim 3 and includes all the limitations of claim 3. Claim 10 recites wherein the data input to the learned model includes a token of the determined context sequence and a predetermined token, and the data output by the learned model includes a token of the new target sequence in claim 10. But the data input to the learned model includes a token of the determined context sequence and a predetermined token, and the data output by the learned model includes a token of the new target sequence does not go beyond the abstract idea itself. There are no additional components in the claim that would make it significantly more than the abstract idea.
Claim 11 is dependent on claim 9 and includes all the limitations of claim 9. Claim 11 recites wherein a series of information given by the sequence is music information indicating a pitch value of sound for each time, and the token indicates at least one of the pitch value of the sound and a generation period of the sound in claim 11. But a series of information given by the sequence is music information indicating a pitch value of sound for each time, and the token indicates at least one of the pitch value of the sound and a generation period of the sound does not go beyond the abstract idea itself. There are no additional components in the claim that would make it significantly more than the abstract idea.
Claim 12 recites, “generates a sequence including a determined context sequence and a new target sequence using input information and a learned model, the input information being information concerning a sequence in which a part is configured by a target sequence and a remainder is configured by a context sequence and that provides a series of information; and a user interface that receives the input information and presents a generation result of the generation unit, wherein when data corresponding to the input information is input, the learned model outputs data corresponding to the new target sequence”.
The limitation of “generates a sequence including a determined context sequence and a new target sequence using input information and a learned model, the input information being information concerning a sequence in which a part is configured by a target sequence and a remainder is configured by a context sequence and that provides a series of information; and a user interface that receives the input information and presents a generation result of the generation unit, wherein when data corresponding to the input information is input, the learned model outputs data corresponding to the new target sequence” is a mental process: nothing in the claim element precludes the step from practically being performed in the mind. Accordingly, the claim recites an abstract idea. This judicial exception is not integrated into a practical application. The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. The claim is not patent eligible.
Claim 13 is dependent on claim 12 and includes all the limitations of claim 12. Claim 13 recites wherein the user interface receives, as the input information, the determined context sequence and position information of the determined context sequence in a sequence in claim 13. But the user interface receives, as the input information, the determined context sequence and position information of the determined context sequence in a sequence does not go beyond the abstract idea itself. There are no additional components in the claim that would make it significantly more than the abstract idea.
Claim 14 is dependent on claim 12 and includes all the limitations of claim 12. Claim 14 recites wherein the user interface receives, as the input information, information concerning the sequence generated by the generation unit, and the generation unit generates a sequence including, as the new target sequence, a target sequence different from the target sequence of the sequence generated by the generation unit in claim 14. But the user interface receives, as the input information, information concerning the sequence generated by the generation unit, and the generation unit generates a sequence including, as the new target sequence, a target sequence different from the target sequence of the sequence generated by the generation unit does not go beyond the abstract idea itself. There are no additional components in the claim that would make it significantly more than the abstract idea.
Claim 15 is dependent on claim 12 and includes all the limitations of claim 12. Claim 15 recites wherein the user interface receives, as the input information, information for designating at least one sequence among a plurality of sequences generated by the generation unit, and the generation unit generates a sequence including, as the new target sequence, a target sequence different from a target sequence of the designated sequence in claim 15. But the user interface receives, as the input information, information for designating at least one sequence among a plurality of sequences generated by the generation unit, and the generation unit generates a sequence including, as the new target sequence, a target sequence different from a target sequence of the designated sequence does not go beyond the abstract idea itself. There are no additional components in the claim that would make it significantly more than the abstract idea.
Claim 16 is dependent on claim 12 and includes all the limitations of claim 12. Claim 16 recites wherein the user interface receives, as the input information, information for designating two sequences among a plurality of sequences generated by the generation unit, and the generation unit generates a sequence including, as the new target sequence, a target sequence having a feature between features of target sequences of the designated two sequences in claim 16. But the user interface receives, as the input information, information for designating two sequences among a plurality of sequences generated by the generation unit, and the generation unit generates a sequence including, as the new target sequence, a target sequence having a feature between features of target sequences of the designated two sequences does not go beyond the abstract idea itself. There are no additional components in the claim that would make it significantly more than the abstract idea.
Claim 17 is dependent on claim 12 and includes all the limitations of claim 12. Claim 17 recites wherein the user interface receives, as the input information, information for designating a feature of a sequence, and the generation unit generates a sequence having a designated feature in claim 17. But the user interface receives, as the input information, information for designating a feature of a sequence, and the generation unit generates a sequence having a designated feature does not go beyond the abstract idea itself. There are no additional components in the claim that would make it significantly more than the abstract idea.
Claim 18 is dependent on claim 12 and includes all the limitations of claim 12. Claim 18 recites wherein the data input to the learned model includes a token of the determined context sequence, and the data output by the learned model includes a token of the new target sequence in claim 18. But the data input to the learned model includes a token of the determined context sequence, and the data output by the learned model includes a token of the new target sequence does not go beyond the abstract idea itself. There are no additional components in the claim that would make it significantly more than the abstract idea.
Claim 19 is dependent on claim 12 and includes all the limitations of claim 12. Claim 19 recites wherein the data input to the learned model includes a token of the determined context sequence and a predetermined token, and the data output by the learned model includes a token of the new target sequence in claim 19. But the data input to the learned model includes a token of the determined context sequence and a predetermined token, and the data output by the learned model includes a token of the new target sequence does not go beyond the abstract idea itself. There are no additional components in the claim that would make it significantly more than the abstract idea.
Claim 20 is dependent on claim 18 and includes all the limitations of claim 18. Claim 20 recites wherein a series of information given by the sequence is music information indicating a pitch value of sound for each time, and the token indicates at least one of the pitch value of the sound and a generation period of the sound in claim 20. But a series of information given by the sequence is music information indicating a pitch value of sound for each time, and the token indicates at least one of the pitch value of the sound and a generation period of the sound does not go beyond the abstract idea itself. There are no additional components in the claim that would make it significantly more than the abstract idea.
Claim 21 recites, “generating a sequence including a determined context sequence and a new target sequence using input information and a learned model, the input information being information concerning a sequence in which a part is configured by a target sequence and a remainder is configured by a context sequence and that gives a series of information, wherein when data corresponding to the input information is input, the learned model outputs data corresponding to the new target sequence”.
The limitation of “generating a sequence including a determined context sequence and a new target sequence using input information and a learned model, the input information being information concerning a sequence in which a part is configured by a target sequence and a remainder is configured by a context sequence and that gives a series of information, wherein when data corresponding to the input information is input, the learned model outputs data corresponding to the new target sequence” is a mental process: nothing in the claim element precludes the step from practically being performed in the mind. Accordingly, the claim recites an abstract idea. This judicial exception is not integrated into a practical application. The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. The claim is not patent eligible.
Claim 22 recites, “generating a sequence including a determined context sequence and a new target sequence using input information and a learned model, the input information being information concerning a sequence in which a part is configured by a target sequence and a remainder is configured by a context sequence and that gives a series of information, wherein when data corresponding to the input information is input, the learned model outputs data corresponding to the new target sequence”.
The limitation of “generating a sequence including a determined context sequence and a new target sequence using input information and a learned model, the input information being information concerning a sequence in which a part is configured by a target sequence and a remainder is configured by a context sequence and that gives a series of information, wherein when data corresponding to the input information is input, the learned model outputs data corresponding to the new target sequence” is a mental process: nothing in the claim element precludes the step from practically being performed in the mind. Accordingly, the claim recites an abstract idea. This judicial exception is not integrated into a practical application. The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. The claim is not patent eligible.
Claim Rejections - 35 U.S.C. § 103
In the event the determination of the status of the application as subject to AIA 35
U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any
correction of the statutory basis for the rejection will not be considered a new ground of
rejection if the prior art relied upon, and the rationale supporting the rejection, would be
the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all
obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the
claimed invention is not identically disclosed as set forth in section 102, if the
differences between the claimed invention and the prior art are such that the
claimed invention as a whole would have been obvious before the effective filing
date of the claimed invention to a person having ordinary skill in the art to which
the claimed invention pertains. Patentability shall not be negated by the manner in
which the invention was made.
Claims 1-22 are rejected under 35 U.S.C. 103 as being unpatentable
over MATSUMURA et al. (US 2019/0012611 A1) in view of DAIDO et al. (US 2021/0256960).
Regarding claim 1, MATSUMURA teaches an information processing apparatus comprising: control means;
data input means for inputting sequence data, (See MATSUMURA paragraph [0080], The training data generation part 113 inputs the generated random number sequences into the learning model unit 120 after learning to obtain their respective output values (sequences) (S105));
a machine learning model that generates new sequence data based on the sequence data input by the data input means, (See MATSUMURA paragraph [0124], the machine learning system 10 according to the present embodiment was able to autonomously achieve leaning for more complex data from short sequence data (L=5) input from the outside); and
sequence data selecting means for, when the new sequence data is generated by the machine learning model, (See MATSUMURA paragraph [0119], the machine learning system 10 repeated generation of new training data and relearning of the learning model unit 120 (self-learning of the learning model unit 120). The machine learning system 10 according to the present embodiment was able to autonomously generate the learning model unit 120 that is applied to the data with long sequence length (L=19)), selecting target sequence data for changing the sequence data, and/or context sequence data for not changing the sequence data, (See MATSUMURA paragraph [0032], the system designer already has the declarative knowledge of the target problem. For example, in the example of the problem of sorting a sequence of numbers, the system designer can determine whether the result of changing the order of the sequence is the correct sorting result, even without having the procedural knowledge (model) for obtaining the correct sorting result), wherein
the control means: (i) generates new target sequence data that interpolates at least two sequence data already generated by the machine learning model, (See MATSUMURA paragraph [0081], The training data generation part 113 selects new training data (teacher data) samples from the generated training data sample candidates based on the validation rule 147 (S106). In each sample selected as training data, the elements of the output number sequence correspond to the elements of the input random number sequence).
MATSUMURA does not explicitly disclose (ii) generating new different sequence data for the sequence data already generated by the machine learning model.
However, DAIDO teaches or (ii) generates new different sequence data for the sequence data already generated by the machine learning model, (See DAIDO paragraph [0024], The piece of singer data Xa in the first embodiment are represented as an embedding vector in a multidimensional first space. The first space is a continuous space, in which a position corresponding to each singer in the space is determined in accordance with the acoustic features of the singing voice of the singer. The more similar the acoustic features of a singing voice of a first singer to that of a singing voice of a second singer among the different singers, the closer the vector of the first singer and the vector of the second singer in the first space).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify MATSUMURA with DAIDO’s teaching of (ii) generating new different sequence data for the sequence data already generated by the machine learning model, in order to synthesize a target sound that is vocalized by a variety of persons speaking in a variety of performance styles.
Regarding claim 2, MATSUMURA taught the information processing apparatus of claim 1, as described above.
MATSUMURA does not explicitly disclose display means for displaying, in a designation enabled form, a position in a space, that defines a feature value of the sequence data learned by the machine learning model, wherein the control means generates, as the new sequence data, sequence data having a feature value corresponding to a designated position in the space.
However, DAIDO teaches display means for displaying, in a designation enabled form, a position in a space, ( See DAIDO paragraph [0079], the sound source data represents a position corresponding to the first sound source among different sound sources within a space representative of relations between acoustic features of the different sound sources), that defines a feature value of the sequence data learned by the machine learning model, (See DAIDO paragraph [0039], The learning processor 26 in the first embodiment collectively trains an encoding model E along with the synthesis model M as the main target of the machine learning….the synthesis model M outputs a series of feature data), wherein
the control means generates, as the new sequence data, (See DAIDO paragraph [0079], A piece of feature data Q includes a fundamental frequency Qa and a spectral envelope Qb of the audio signal V2. A piece of feature data Q is generated sequentially for each time unit), sequence data having a feature value corresponding to a designated position in the space, (See DAIDO paragraph [0079], The singer space refers to a continuous space, in which the position corresponding to each singer in the space is determined in accordance with acoustic features of the singing voice of the singer).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify MATSUMURA with DAIDO’s teaching of display means for displaying, in a designation enabled form, a position in a space that defines a feature value of the sequence data learned by the machine learning model, wherein the control means generates, as the new sequence data, sequence data having a feature value corresponding to a designated position in the space, in order to synthesize a target sound that is vocalized by a variety of persons speaking in a variety of performance styles.
Regarding claim 3, MATSUMURA teaches an information processing apparatus comprising
a generation unit that generates a sequence including a determined context sequence and a new target sequence using input information and a learned model, (See MATSUMURA paragraph [0079], generates a random number sequence with the “length” of the predetermined number according to the procedure 451 for generating input data of new training data), the input information being information concerning a sequence in which a part is configured by a target sequence, (See MATSUMURA paragraph [0122], the learning with sequence length 6. A predictive value 335 that the learning model unit 120 output for the input value is different from a target value 333), wherein when data corresponding to the input information is input, the learned model outputs data corresponding to the new target sequence, (See MATSUMURA paragraph [0072], The sorting problem rearranges an input sequence of numbers in descending or ascending order. FIG. 4A shows an example of the information included in the self-training rule 145 for the sorting problem. The self-training rule 145 establishes a procedure 451 for generating input data of new training data).
MATSUMURA does not explicitly disclose a remainder is configured by a context sequence and that provides a series of information.
However, DAIDO teaches a remainder is configured by a context sequence and that provides a series of information, (See DAIDO paragraph [0028], A piece of feature data Q is generated sequentially for each time unit of predetermined length (e.g., 5 milliseconds). In other words, the synthesis processor 21 in the first embodiment generates the series of the fundamental frequencies Qa and the series of the spectral envelopes Qb in the sequential pieces of feature data).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify MATSUMURA to incorporate a remainder configured by a context sequence that provides a series of information, as taught by DAIDO, in order to synthesize a target sound that is vocalized by a variety of persons speaking in a variety of performance styles.
Regarding claim 4, MATSUMURA taught the information processing apparatus of claim 3, as described. MATSUMURA further teaches wherein
the input information includes, (See MATSUMURA paragraph [0054], The input device 242 is a hardware device by which the user inputs instructions and information to a text generation device 100):
the determined context sequence, (See MATSUMURA paragraph [0032], the system designer can determine whether the result of changing the order of the sequence is the correct sorting result).
MATSUMURA does not explicitly disclose position information of the determined context sequence in the sequence.
However, DAIDO teaches position information of the determined context sequence in the sequence, (See DAIDO paragraph [0025], the note images Ga represent a series of notes of the tune represented by the audio signal V1. The display controller 22 disposes a series of note images Ga on the editing screen G in accordance with the condition data Xb generated by the signal analyzer 21. Specifically, the position of each note image Ga in the direction of the pitch axis is determined in accordance with a pitch of the corresponding note represented by the condition data Xb).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify MATSUMURA to incorporate position information of the determined context sequence in the sequence, as taught by DAIDO, in order to synthesize a target sound that is vocalized by a variety of persons speaking in a variety of performance styles.
Regarding claim 5, MATSUMURA taught the information processing apparatus of claim 3, as described. MATSUMURA further teaches wherein the input information includes information concerning the sequence generated by the generation unit, (See MATSUMURA paragraph [0079], the training data generation part 113 generates a random number sequence with the “length” of the predetermined number according to the procedure 451 for generating input data of new training data (S104)), and
the generation unit generates a sequence including, as the new target sequence, (See MATSUMURA paragraph [0079], generates a random number sequence with the “length” of the predetermined number according to the procedure 451 for generating input data of new training data).
MATSUMURA does not explicitly disclose a target sequence different from the target sequence of the sequence generated by the generation unit.
However, DAIDO teaches a target sequence different from the target sequence of the sequence generated by the generation unit, (See DAIDO paragraph [0024], The piece of singer data Xa in the first embodiment are represented as an embedding vector in a multidimensional first space. The first space is a continuous space, in which a position corresponding to each singer in the space is determined in accordance with the acoustic features of the singing voice of the singer. The more similar the acoustic features of a singing voice of a first singer to that of a singing voice of a second singer among the different singers, the closer the vector of the first singer and the vector of the second singer in the first space).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify MATSUMURA to incorporate a target sequence different from the target sequence of the sequence generated by the generation unit, as taught by DAIDO, in order to synthesize a target sound that is vocalized by a variety of persons speaking in a variety of performance styles.
Regarding claim 6, MATSUMURA taught the information processing apparatus of claim 3, as described. MATSUMURA further teaches wherein
the input information includes information for designating at least one sequence among a plurality of sequences generated by the generation unit, (See MATSUMURA paragraph [0073], The procedure 451 for generating input data of new training data represents a function for generating new input data x. The function returns a random number sequence with a predetermined length “length”), and
the generation unit generates a sequence including, as the new target sequence, (See MATSUMURA paragraph [0079], generates a random number sequence with the “length” of the predetermined number according to the procedure 451 for generating input data of new training data).
MATSUMURA does not explicitly disclose a target sequence different from a target sequence of the designated sequence.
However, DAIDO teaches a target sequence different from a target sequence of the designated sequence, (See DAIDO paragraph [0024], The piece of singer data Xa in the first embodiment are represented as an embedding vector in a multidimensional first space. The first space is a continuous space, in which a position corresponding to each singer in the space is determined in accordance with the acoustic features of the singing voice of the singer. The more similar the acoustic features of a singing voice of a first singer to that of a singing voice of a second singer among the different singers, the closer the vector of the first singer and the vector of the second singer in the first space).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify MATSUMURA to incorporate a target sequence different from a target sequence of the designated sequence, as taught by DAIDO, in order to synthesize a target sound that is vocalized by a variety of persons speaking in a variety of performance styles.
Regarding claim 7, MATSUMURA taught the information processing apparatus of claim 3, as described. MATSUMURA further teaches wherein
the input information includes information for designating two sequences among a plurality of sequences generated by the generation unit, (See MATSUMURA paragraph [0073], The procedure 451 for generating input data of new training data represents a function for generating new input data x. The function returns a random number sequence with a predetermined length “length”), and
the generation unit generates a sequence including, as the new target sequence, (See MATSUMURA paragraph [0079], generates a random number sequence with the “length” of the predetermined number according to the procedure 451 for generating input data of new training data).
MATSUMURA does not explicitly disclose a target sequence having a feature between features of target sequences of the designated two sequences.
However, DAIDO teaches a target sequence having a feature between features of target sequences of the designated two sequences, (See DAIDO paragraph [0024], The piece of singer data Xa in the first embodiment are represented as an embedding vector in a multidimensional first space. The first space is a continuous space, in which a position corresponding to each singer in the space is determined in accordance with the acoustic features of the singing voice of the singer. The more similar the acoustic features of a singing voice of a first singer to that of a singing voice of a second singer among the different singers, the closer the vector of the first singer and the vector of the second singer in the first space).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify MATSUMURA to incorporate a target sequence having a feature between features of target sequences of the designated two sequences, as taught by DAIDO, in order to synthesize a target sound that is vocalized by a variety of persons speaking in a variety of performance styles.
Regarding claim 8, MATSUMURA taught the information processing apparatus of claim 3, as described. MATSUMURA further teaches wherein
the input information includes information for designating a feature of a sequence, (See MATSUMURA paragraph [0073], The procedure 451 for generating input data of new training data represents a function for generating new input data x. The function returns a random number sequence with a predetermined length “length”), and
the generation unit generates a sequence having the designated feature, (See MATSUMURA paragraph [0032], The training data generation part 113 inputs the generated random number sequences into the learning model unit).
Regarding claim 9, MATSUMURA taught the information processing apparatus of claim 3, as described. MATSUMURA further teaches wherein the data output by the learned model includes a token of the new target sequence, (See MATSUMURA paragraph [0080], The training data generation part 113 inputs the generated random number sequences into the learning model unit 120 after learning to obtain their respective output values (sequences)).
MATSUMURA does not explicitly disclose the data input to the learned model includes a token of the determined context sequence.
However, DAIDO teaches the data input to the learned model includes a token of the determined context sequence, (See DAIDO paragraph [0025], the note images Ga represent a series of notes of the tune represented by the audio signal V1. The display controller 22 disposes a series of note images Ga on the editing screen G in accordance with the condition data Xb generated by the signal analyzer 21. Specifically, the position of each note image Ga in the direction of the pitch axis is determined in accordance with a pitch of the corresponding note represented by the condition data Xb).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify MATSUMURA to incorporate data input to the learned model that includes a token of the determined context sequence, as taught by DAIDO, in order to synthesize a target sound that is vocalized by a variety of persons speaking in a variety of performance styles.
Regarding claim 10, MATSUMURA taught the information processing apparatus of claim 3, as described. MATSUMURA further teaches wherein
the data input to the learned model includes a token of the determined context sequence and a predetermined token, (See MATSUMURA paragraph [0073], The procedure 451 for generating input data of new training data represents a function for generating new input data x. The function returns a random number sequence with a predetermined length “length”), and
the data output by the learned model includes a token of the new target sequence, (See MATSUMURA paragraph [0080], The training data generation part 113 inputs the generated random number sequences into the learning model unit 120 after learning to obtain their respective output values (sequences)).
Regarding claim 11, MATSUMURA taught the information processing apparatus of claim 3, as described.
MATSUMURA does not explicitly disclose wherein a series of information given by the sequence is music information indicating a pitch value of sound for each time, and the token indicates at least one of the pitch value of the sound and a generation period of the sound.
However, DAIDO teaches wherein a series of information given by the sequence is music information indicating a pitch value of sound for each time, (See DAIDO paragraph [0018], the audio signal V1 represents the singing voice of a tune vocalized by a specific singer (hereinafter, referred to as an “additional singer”). Specifically, an audio signal V1 recorded in a recording medium, such as a music CD, or an audio signal V1 received via a communication network…the singing conditions include pitches, volumes, and phonetic identifiers), and the token indicates at least one of the pitch value of the sound and a generation period of the sound, (See DAIDO paragraph [0021], feature data Q representative of features of the singing voice. The condition data Xb in the first embodiment are a series of pieces of data which specify, as the singing conditions, a pitch, a phonetic identifier (a pronounced letter) and a sound period for each note of a series of notes in the tune).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify MATSUMURA to incorporate a series of information given by the sequence being music information indicating a pitch value of sound for each time, wherein the token indicates at least one of the pitch value of the sound and a generation period of the sound, as taught by DAIDO, in order to synthesize a target sound that is vocalized by a variety of persons speaking in a variety of performance styles.
Regarding claim 12, MATSUMURA teaches an information processing apparatus comprising:
a generation unit that generates a sequence including a determined context sequence and a new target sequence using input information and a learned model, (See MATSUMURA paragraph [0081], The training data generation part 113 selects new training data (teacher data) samples from the generated training data sample candidates based on the validation rule 147 (S106). In each sample selected as training data, the elements of the output number sequence correspond to the elements of the input random number sequence), the input information being information concerning a sequence in which a part is configured by a target sequence, (See MATSUMURA paragraph [0122], the learning with sequence length 6. A predictive value 335 that the learning model unit 120 output for the input value is different from a target value 333), and
a user interface that receives the input information and presents a generation result of the generation unit, (See MATSUMURA paragraph [0122], the result when the learning with sequence length 5 is completed. An input value 321 is 0/1 binary data with sequence width 3 and sequence length 5. A start flag 301 and an end flag 303 are attached to the input value 321.), wherein
when data corresponding to the input information is input, the learned model outputs data corresponding to the new target sequence, (See MATSUMURA paragraph [0079], generates a random number sequence with the “length” of the predetermined number according to the procedure 451 for generating input data of new training data).
MATSUMURA does not explicitly disclose a remainder is configured by a context sequence and that provides a series of information.
However, DAIDO teaches a remainder is configured by a context sequence and that provides a series of information, (See DAIDO paragraph [0028], A piece of feature data Q is generated sequentially for each time unit of predetermined length (e.g., 5 milliseconds). In other words, the synthesis processor 21 in the first embodiment generates the series of the fundamental frequencies Qa and the series of the spectral envelopes Qb in the sequential pieces of feature data).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify MATSUMURA to incorporate a remainder configured by a context sequence that provides a series of information, as taught by DAIDO, in order to synthesize a target sound that is vocalized by a variety of persons speaking in a variety of performance styles.
Regarding claim 13, MATSUMURA taught the information processing apparatus of claim 12, as described. MATSUMURA further teaches wherein the user interface receives, as the input information, (See MATSUMURA paragraph [0054], an input device 242 and a display device 244 are connected to the input/output interface 240. The input device 242 is a hardware device by which the user inputs instructions and information to a text generation device 100).
MATSUMURA does not explicitly disclose the determined context sequence and position information of the determined context sequence in a sequence.
However, DAIDO teaches the determined context sequence and position information of the determined context sequence in a sequence, (See DAIDO paragraph [0025], the note images Ga represent a series of notes of the tune represented by the audio signal V1. The display controller 22 disposes a series of note images Ga on the editing screen G in accordance with the condition data Xb generated by the signal analyzer 21. Specifically, the position of each note image Ga in the direction of the pitch axis is determined in accordance with a pitch of the corresponding note represented by the condition data Xb).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify MATSUMURA to incorporate the determined context sequence and position information of the determined context sequence in a sequence, as taught by DAIDO, in order to synthesize a target sound that is vocalized by a variety of persons speaking in a variety of performance styles.
Regarding claim 14, MATSUMURA taught the information processing apparatus of claim 12, as described. MATSUMURA further teaches wherein the user interface receives, as the input information, (See MATSUMURA paragraph [0054], an input device 242 and a display device 244 are connected to the input/output interface 240. The input device 242 is a hardware device by which the user inputs instructions and information to a text generation device 100), information concerning the sequence generated by the generation unit, and the generation unit generates a sequence, (See MATSUMURA paragraph [0081], The training data generation part 113 selects new training data (teacher data) samples from the generated training data sample candidates based on the validation rule 147 (S106). In each sample selected as training data, the elements of the output number sequence correspond to the elements of the input random number sequence), including, as the new target sequence, a target sequence different from the target sequence of the sequence generated by the generation unit, (See MATSUMURA paragraph [0079], generates a random number sequence with the “length” of the predetermined number according to the procedure 451 for generating input data of new training data).
Regarding claim 15, MATSUMURA taught the information processing apparatus of claim 12, as described. MATSUMURA further teaches wherein the user interface receives, as the input information, (See MATSUMURA paragraph [0054], an input device 242 and a display device 244 are connected to the input/output interface 240. The input device 242 is a hardware device by which the user inputs instructions and information to a text generation device 100), information for designating at least one sequence among a plurality of sequences generated by the generation unit, (See MATSUMURA paragraph [0073], The procedure 451 for generating input data of new training data represents a function for generating new input data x. The function returns a random number sequence with a predetermined length “length”), and the generation unit generates a sequence including, as the new target sequence, (See MATSUMURA paragraph [0079], generates a random number sequence with the “length” of the predetermined number according to the procedure 451 for generating input data of new training data), a target sequence different from a target sequence of the designated sequence, (See MATSUMURA paragraph [0032], The training data generation part 113 inputs the generated random number sequences into the learning model unit).
Regarding claim 16, MATSUMURA taught the information processing apparatus of claim 12, as described. MATSUMURA further teaches wherein the user interface receives, as the input information, (See MATSUMURA paragraph [0054], an input device 242 and a display device 244 are connected to the input/output interface 240. The input device 242 is a hardware device by which the user inputs instructions and information to a text generation device 100), information for designating two sequences among a plurality of sequences generated by the generation unit, (See MATSUMURA paragraph [0073], The procedure 451 for generating input data of new training data represents a function for generating new input data x. The function returns a random number sequence with a predetermined length “length”), and
the generation unit generates a sequence including, as the new target sequence, a target sequence having a feature between features of target sequences of the designated two sequences, (See MATSUMURA paragraph [0122], the learning with sequence length 6. A predictive value 335 that the learning model unit 120 output for the input value is different from a target value 333. There is a difference 337 between the predicted value 335 and the target value 333).
Regarding claim 17, MATSUMURA taught the information processing apparatus of claim 12, as described. MATSUMURA further teaches wherein the user interface receives, as the input information, (See MATSUMURA paragraph [0054], an input device 242 and a display device 244 are connected to the input/output interface 240. The input device 242 is a hardware device by which the user inputs instructions and information to a text generation device 100), information for designating a feature of a sequence, (See MATSUMURA paragraph [0073], The procedure 451 for generating input data of new training data represents a function for generating new input data x. The function returns a random number sequence with a predetermined length “length”), and
the generation unit generates a sequence having a designated feature, (See MATSUMURA paragraph [0073], The procedure 451 for generating input data of new training data represents a function for generating new input data x. The function returns a random number sequence with a predetermined length “length”).
Regarding claim 18, MATSUMURA taught the information processing apparatus of claim 12, as described. MATSUMURA further teaches wherein the data input to the learned model includes a token of the determined context sequence, (See MATSUMURA paragraph [0079], generates a random number sequence with the “length” of the predetermined number according to the procedure 451 for generating input data of new training data), and the data output by the learned model includes a token of the new target sequence, (See MATSUMURA paragraph [0072], The sorting problem rearranges an input sequence of numbers in descending or ascending order. FIG. 4A shows an example of the information included in the self-training rule 145 for the sorting problem. The self-training rule 145 establishes a procedure 451 for generating input data of new training data).
Regarding claim 19, MATSUMURA taught the information processing apparatus of claim 12, as described. MATSUMURA further teaches wherein the data input to the learned model, (See MATSUMURA paragraph [0073], The machine learning system 10 has learning mode and operation mode (process mode) for the learning model unit 120. In the operation mode, the learning model unit 120 generates output data for input data), includes a token of the determined context sequence and a predetermined token, (See MATSUMURA paragraph [0073], The procedure 451 for generating input data of new training data represents a function for generating new input data x. The function returns a random number sequence with a predetermined length “length”), and
the data output by the learned model includes a token of the new target sequence, (See MATSUMURA paragraph [0073], The procedure 451 for generating input data of new training data represents a function for generating new input data x. The function returns a random number sequence with a predetermined length “length”).
Regarding claim 20, MATSUMURA taught the information processing apparatus of claim 18, as described.
MATSUMURA does not explicitly disclose a series of information given by the sequence is music information indicating a pitch value of sound for each time, and the token indicates at least one of the pitch value of the sound and a generation period of the sound.
However, DAIDO teaches wherein a series of information given by the sequence is music information indicating a pitch value of sound for each time, (See DAIDO paragraph [0018], the audio signal V1 represents the singing voice of a tune vocalized by a specific singer (hereinafter, referred to as an “additional singer”). Specifically, an audio signal V1 recorded in a recording medium, such as a music CD, or an audio signal V1 received via a communication network…the singing conditions include pitches, volumes, and phonetic identifiers), and the token indicates at least one of the pitch value of the sound and a generation period of the sound, (See DAIDO paragraph [0021], feature data Q representative of features of the singing voice. The condition data Xb in the first embodiment are a series of pieces of data which specify, as the singing conditions, a pitch, a phonetic identifier (a pronounced letter) and a sound period for each note of a series of notes in the tune).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify MATSUMURA to incorporate a series of information given by the sequence being music information indicating a pitch value of sound for each time, wherein the token indicates at least one of the pitch value of the sound and a generation period of the sound, as taught by DAIDO, in order to synthesize a target sound that is vocalized by a variety of persons speaking in a variety of performance styles.
Regarding claim 21, MATSUMURA teaches an information processing method comprising
generating a sequence including a determined context sequence and a new target sequence using input information and a learned model, (See MATSUMURA paragraph [0079], generates a random number sequence with the “length” of the predetermined number according to the procedure 451 for generating input data of new training data), the input information being information concerning a sequence in which a part is configured by a target sequence, (See MATSUMURA paragraph [0122], the learning with sequence length 6. A predictive value 335 that the learning model unit 120 output for the input value is different from a target value 333) and wherein
when data corresponding to the input information is input, the learned model outputs data corresponding to the new target sequence, (See MATSUMURA paragraph [0072], The sorting problem rearranges an input sequence of numbers in descending or ascending order. FIG. 4A shows an example of the information included in the self-training rule 145 for the sorting problem. The self-training rule 145 establishes a procedure 451 for generating input data of new training data).
MATSUMURA does not explicitly disclose a remainder is configured by a context sequence and that gives a series of information.
However, DAIDO teaches a remainder is configured by a context sequence and that gives a series of information, (See DAIDO paragraph [0028], A piece of feature data Q is generated sequentially for each time unit of predetermined length (e.g., 5 milliseconds). In other words, the synthesis processor 21 in the first embodiment generates the series of the fundamental frequencies Qa and the series of the spectral envelopes Qb in the sequential pieces of feature data).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify MATSUMURA to incorporate a remainder configured by a context sequence that gives a series of information, as taught by DAIDO, in order to synthesize a target sound that is vocalized by a variety of persons speaking in a variety of performance styles.
Regarding claim 22, MATSUMURA teaches an information processing program for causing a computer to execute
generating a sequence including a determined context sequence and a new target sequence using input information and a learned model, (See MATSUMURA paragraph [0079], generates a random number sequence with the “length” of the predetermined number according to the procedure 451 for generating input data of new training data), the input information being information concerning a sequence in which a part is configured by a target sequence, (See MATSUMURA paragraph [0122], the learning with sequence length 6. A predictive value 335 that the learning model unit 120 output for the input value is different from a target value 333) and, wherein
when data corresponding to the input information is input, the learned model outputs data corresponding to the new target sequence, (See MATSUMURA paragraph [0072], The sorting problem rearranges an input sequence of numbers in descending or ascending order. FIG. 4A shows an example of the information included in the self-training rule 145 for the sorting problem. The self-training rule 145 establishes a procedure 451 for generating input data of new training data).
MATSUMURA does not explicitly disclose a remainder is configured by a context sequence and that gives a series of information.
However, DAIDO teaches a remainder is configured by a context sequence and that gives a series of information, (See DAIDO paragraph [0028], A piece of feature data Q is generated sequentially for each time unit of predetermined length (e.g., 5 milliseconds). In other words, the synthesis processor 21 in the first embodiment generates the series of the fundamental frequencies Qa and the series of the spectral envelopes Qb in the sequential pieces of feature data).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify MATSUMURA to incorporate a remainder configured by a context sequence that gives a series of information, as taught by DAIDO, in order to synthesize a target sound that is vocalized by a variety of persons speaking in a variety of performance styles.
Conclusion/Points of Contact
The prior art made of record and not relied upon is considered pertinent to
applicant’s disclosure. See form PTO-892.
Gouyon et al. (US 2017/0140743 A1) discloses a system for selecting audio similar to music provided to a client device, comprising a processor and a computer-readable storage medium comprising instructions executable by the processor. The instructions comprise instructions for performing the following steps: sponsored audio information received from a third-party sponsor is accessed, and reference music features describing characteristics of reference songs are obtained.
MAEZAWA (US 2020/0394989 A1) discloses generating performance tendency information indicating a performance tendency of a performance of a musical piece by a user from observational performance data representing the performance input to a learned model, and generating time series data of the musical piece according to the generated performance tendency information.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MULUEMEBET GURMU whose telephone number is (571)270-7095. The examiner can normally be reached M-F 9am - 5pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Tony Mahmoudi, can be reached at 571-272-4078. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/MULUEMEBET GURMU/Primary Examiner, Art Unit 2163