Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
DETAILED ACTION
This communication is in response to Application No. 18/778,923 filed on 20 July 2024.
Claims 1-20 are presented for examination.
Information Disclosure Statement
The information disclosure statement (IDS) submitted on 07/20/2024 is in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statement is being considered by the examiner.
Claim Interpretation
5. Claim 20 recites “A computer program product for identification and resolution of anomalies over a network, the computer program product comprising a computer-readable storage medium having program instructions embodied therewith.” Paragraph 0046 of Applicant’s specification states “the term A computer-readable storage medium, as that term is used in the disclosure, is not to be construed as storage in the form of transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide, light pulses passing through a fiber optic cable, electrical signals communicated through a wire, and/or other transmission media. As will be understood by those of skill in the art, data is typically moved at some occasional points in time during normal operations of a storage device, such as during access, de-fragmentation, or garbage collection, but this does not render the storage device as transitory because the data is not transitory while it is stored.” Accordingly, the Examiner interprets the computer-readable storage medium to be non-transitory and thus directed to statutory subject matter.
Claim Rejections - 35 USC § 103
6. The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
7. The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
8. Claims 1-4, 14-16 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Black et al. (US 2020/0195662 A1) in view of Olson et al. (US 2023/0156024 A1).
Regarding Claim 1, Black teaches a computer-implemented method, comprising: obtaining, by a computer, at least a set of media characteristics associated with media data transmitted from a first entity to a second entity over a network and contextual information associated with at least one of the first entity or the second entity ([paragraph 0071-0072, 0074-0077] describes obtaining by a security analytics system (e.g. a computer) media attributes associated with electronic communication related to media transmitted from a first user or sending entity (e.g. a first entity) to a second user or receiving entity (e.g. a second entity) over a network and contextual information associated with a given user behavior (e.g. the first entity or the second entity));
determining, by the computer, a first operational score associated with the first entity based on at least the obtained set of media characteristics and the obtained contextual information ([paragraph 0071-0072, 0094, 0116-0117] describes determining by the security analytics system (e.g. the computer) an operational score related to a risk associated with the first user or sending entity (e.g. the first entity) according to obtained media attributes associated with electronic communication and obtained contextual information),
wherein the first operational score is indicative of operating conditions associated with the first entity for the transmission of the media data over the network ([paragraph 0116-0117, 0131-0132] describes the operational score related to the risk is indicative of condition existing at operation associated with the first user or sending entity (e.g. the first entity) for the electronic communication related to media transmission over the network);
identifying, by the computer, a set of anomalies associated with the first entity based on a comparison of the first operational score with a threshold ([paragraph 0071-0072, 0122-0124] describes identifying anomalous, abnormal, unexpected or malicious behavior associated with the first user or sending entity (e.g. the first entity) based on comparing the operational score related to the risk and the number of linked contacts with a threshold number);
Black does not explicitly disclose determining, by the computer, a set of operations for resolving the set of anomalies associated with the first entity; and controlling, by the computer, the first entity to execute the determined set of operations on the media data.
However, in a similar field of endeavor, Olson discloses determining, by the computer, a set of operations for resolving the set of anomalies associated with the first entity ([paragraph 0002, 0038-0039] describes a first entity and a second entity, a threshold metric associated with anomalous behavior, a determination based upon the threshold metric that the first entity is fraudulent, and a system (the computer) for identifying fraudulent entities; [paragraph 0124-0126] describes determining by the system (the computer) various operations for resolving anomalies associated with the first entity such as a reduction in transmission of fraudulent media requests (and/or a reduction in bandwidth) (e.g., as a result of discouraging fraudulent first entity from performing malicious actions to control client devices for transmission of media requests));
and controlling, by the computer, the first entity to execute the determined set of operations on the media data ([paragraph 0124-0126] describes fraudulent first entity may be discouraged from performing malicious actions (e.g., using one or more automated operation functionalities, hacking techniques, malware, etc.) to control client devices for transmission of media requests because, by implementing one or more of the techniques presented herein and as a result of identifying a fraudulent entity associated with fraudulent activity, as a result of controlling, such as restricting, transmission of data, such as content items and/or advertisements, to devices associated with the fraudulent entity based upon the identification of the fraudulent entity, etc.).
Therefore, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Black to include determining, by the computer, a set of operations for resolving the set of anomalies associated with the first entity and controlling, by the computer, the first entity to execute the determined set of operations on the media data as taught by Olson. One of ordinary skill in the art would be motivated to utilize the teachings of Black in the Olson system in order to more accurately identify fraudulent entities performing fraudulent activity ([paragraph 0090] in Olson).
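For illustration only, the sequence of steps recited in Claim 1 (obtain characteristics and context, determine an operational score, compare it with a threshold, determine operations, and control an entity to execute them) can be sketched as follows. This is a minimal sketch under stated assumptions: every name, the scoring formula, the 1000 kbps normalization and the anomaly-to-operation mapping are hypothetical and are not drawn from the application, Black, or Olson.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class MediaCharacteristics:
    packet_loss_rate: float  # fraction of packets lost, 0.0 to 1.0 (assumed field)
    bandwidth_kbps: float    # bandwidth observed for the transmission (assumed field)


def operational_score(media: MediaCharacteristics, context_weight: float) -> float:
    """Hypothetical score in [0, 1]; higher means better operating conditions."""
    bandwidth_term = min(media.bandwidth_kbps / 1000.0, 1.0)
    loss_term = 1.0 - media.packet_loss_rate
    return context_weight * bandwidth_term * loss_term


def identify_anomalies(score: float, threshold: float) -> list[str]:
    """Flag an anomaly when the score falls below the threshold."""
    return ["degraded_operating_conditions"] if score < threshold else []


def determine_operations(anomalies: list[str]) -> list[str]:
    """Map each anomaly to candidate resolving operations (assumed mapping)."""
    playbook = {"degraded_operating_conditions": ["bandwidth_throttling", "encoding"]}
    operations: list[str] = []
    for anomaly in anomalies:
        operations.extend(playbook.get(anomaly, []))
    return operations


def resolve(media: MediaCharacteristics, context_weight: float,
            threshold: float, execute: Callable[[str], None]) -> list[str]:
    """Run the full sequence; `execute` stands in for controlling the first entity."""
    score = operational_score(media, context_weight)
    operations = determine_operations(identify_anomalies(score, threshold))
    for operation in operations:
        execute(operation)
    return operations
```

In this sketch the `execute` callback is only a stand-in for the claimed controlling step; how an entity would actually be controlled is left open.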
Regarding Claim 2, the combination of Black and Olson teaches the computer-implemented method, wherein the contextual information comprises context cue information indicative of a comprehension of at least a first portion of the media data by a participant associated with the second entity (Black: [paragraph 0043-0044, 0052, 0100] describes the contextual information includes context indication information which indicates awareness of media data which may include audio, image, video, text, or binary data associated with the second user or receiving entity (e.g. the second entity)).
Regarding Claim 3, the combination of Black and Olson teaches the computer-implemented method, wherein the set of media characteristics comprises at least one of a type of the media data, a resolution of the media data, a duration of media associated with the media data, timestamp data of the media associated with the media data, first bandwidth data associated with the transmission of the media data from the first entity, second bandwidth data associated with reception of the media data at the second entity, a rate of packet loss associated with communication of the media data, or an encryption state of the media data (Black: [paragraph 0043, 0113] describes media attributes includes type of media such as audio, image, video, text, or binary data, text messages, social media messages etc.; Black: [paragraph 0091-0092] describes media attributes includes duration of the media or timestamp data of media associated with the media data).
Regarding Claim 4, the combination of Black and Olson teaches the computer-implemented method, further comprising: obtaining, by the computer, entity information associated with the first entity and the second entity, wherein the entity information comprises at least one of entity type information, entity identifier information, entity network information, entity location information, entity participant data, entity resource information, or entity status information (Black: [paragraph 0057, 0079-0080, 0092] describes obtaining entity information includes entity identifier information, certain email addresses and social media identifiers, financial account information, such as credit and debit card numbers, user/network 642 interactions between a user and a network, address, physical location, occupation, position, role, marital status, gender, association, affiliation, or assignment).
Regarding Claim 14, the combination of Black and Olson teaches the computer-implemented method, further comprising: determining, by the computer, a set of performance scores based on each operation of the set of operations (Black: [paragraph 0071-0072, 0094, 0104, 0116-0117] describes determining by the security analytics system (e.g. the computer) an operational score related to a risk performance (e.g. performance score) according to each operation of various operations);
selecting, by the computer, a first operation of the set of operations, wherein a first performance score of the first operation is highest among the set of performance scores (Black: [paragraph 0104, 0116-0117, 0122-0124] describes selecting by the security analytics system (e.g. the computer) an operational score related to a risk performance (e.g. first performance score) that is highest among operational scores related to risk performances (e.g. performance scores)); and
controlling, by the computer, the first entity to execute the selected first operation of the set of operations on the media data (Olson: [paragraph 0124-0126] describes first entity may be discouraged (e.g. controlling) from performing malicious actions (e.g., using one or more automated operation functionalities, hacking techniques, malware, etc.) and first entity execute the selected operations on the media).
Therefore, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Black to include controlling, by the computer, the first entity to execute the selected first operation of the set of operations on the media data as taught by Olson. One of ordinary skill in the art would be motivated to utilize the teachings of Black in the Olson system in order to more accurately identify fraudulent entities performing fraudulent activity ([paragraph 0090] in Olson).
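For illustration only, the selection step recited in Claim 14 (choosing the operation whose performance score is highest) can be sketched as follows. The function name and the example operation names and scores are hypothetical; neither the application nor the cited references specify how performance scores are computed.

```python
def select_best_operation(performance_scores: dict[str, float]) -> str:
    """Return the operation whose performance score is highest.

    `performance_scores` maps a hypothetical operation name to its score.
    """
    if not performance_scores:
        raise ValueError("no candidate operations to select from")
    # max() over the keys, ranked by each key's associated score.
    return max(performance_scores, key=performance_scores.get)
```

If two operations tie for the highest score, this sketch returns whichever appears first in the mapping; a tie-breaking rule is not addressed in the claim language quoted above.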
Regarding Claim 15, Black teaches a system, comprising: a processor set configured to ([paragraph 0005] describes a system comprising a processor):
obtain at least a set of media characteristics associated with media data received by a first entity from a second entity over a network and contextual information associated with at least one of the first entity or the second entity ([paragraph 0071-0072, 0074-0077] describes obtaining by a security analytics system (e.g. a computer) media attributes associated with electronic communication related to media transmitted from a first user or sending entity (e.g. a first entity) to a second user or receiving entity (e.g. a second entity) over a network and contextual information associated with a given user behavior (e.g. the first entity or the second entity));
determine a first operational score associated with the first entity based on at least the obtained set of media characteristics and the obtained contextual information ([paragraph 0071-0072, 0094, 0116-0117] describes determining by the security analytics system (e.g. the computer) an operational score related to a risk associated with the first user or sending entity (e.g. the first entity) according to obtained media attributes associated with electronic communication and obtained contextual information),
wherein the first operational score is indicative of operating conditions associated with the first entity for the reception of the media data over the network ([paragraph 0116-0117, 0131-0132] describes the operational score related to the risk is indicative of condition existing at operation associated with the first user or sending entity (e.g. the first entity) for the electronic communication related to media transmission over the network);
identify a set of anomalies associated with the first entity based on a comparison of the first operational score with a threshold ([paragraph 0071-0072, 0122-0124] describes identifying anomalous, abnormal, unexpected or malicious behavior associated with the first user or sending entity (e.g. the first entity) based on comparing the operational score related to the risk and the number of linked contacts with a threshold number);
Black does not explicitly disclose determine a set of operations to resolve the set of anomalies associated with the first entity; and control the second entity to execute the determined set of operations on the media data.
However, in a similar field of endeavor, Olson discloses determine a set of operations to resolve the set of anomalies associated with the first entity ([paragraph 0002, 0038-0039] describes a first entity and a second entity, a threshold metric associated with anomalous behavior, a determination based upon the threshold metric that the first entity is fraudulent, and a system (the computer) for identifying fraudulent entities; [paragraph 0124-0126] describes determining by the system (the computer) various operations for resolving anomalies associated with the first entity such as a reduction in transmission of fraudulent media requests (and/or a reduction in bandwidth) (e.g., as a result of discouraging fraudulent first entity from performing malicious actions to control client devices for transmission of media requests));
and control the second entity to execute the determined set of operations on the media data ([paragraph 0124-0126] describes fraudulent second entity may be discouraged from performing malicious actions (e.g., using one or more automated operation functionalities, hacking techniques, malware, etc.) to control client devices for transmission of media requests because, by implementing one or more of the techniques presented herein and as a result of identifying a fraudulent second entity associated with fraudulent activity, as a result of controlling, such as restricting, transmission of data, such as content items and/or advertisements, to devices associated with the fraudulent second entity based upon the identification of the fraudulent second entity, etc.).
Therefore, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Black to include determining, by the computer, a set of operations for resolving the set of anomalies associated with the first entity and controlling, by the computer, the second entity to execute the determined set of operations on the media data as taught by Olson. One of ordinary skill in the art would be motivated to utilize the teachings of Black in the Olson system in order to more accurately identify fraudulent entities performing fraudulent activity ([paragraph 0090] in Olson).
Regarding Claim 16, this claim contains limitations found within Claim 2, and the same rationale for rejection applies.
Regarding Claim 20, Black teaches a computer program product for identification and resolution of anomalies over a network, the computer program product comprising a computer-readable storage medium having program instructions embodied therewith, the program instructions executable by a system to cause the system to, comprising: processor set configured to ([paragraph 0005, 0051] describes a computer program for detecting anomalous, abnormal, unexpected or malicious user behavior, adaptively responding to mitigate risk over a network and the computer program product comprising a computer-readable storage medium having program instructions embodied therewith, the program instructions executable by a system to cause the system to, comprising: processor configured to):
obtain at least a set of media characteristics associated with media data communicated between a first entity and a second entity via the system over the network and contextual information associated with at least one of the first entity or the second entity ([paragraph 0071-0072, 0074-0077] describes obtaining by a security analytics system (e.g. a computer) media attributes associated with electronic communication related to media transmitted from a first user or sending entity (e.g. a first entity) to a second user or receiving entity (e.g. a second entity) over a network and contextual information associated with a given user behavior (e.g. the first entity or the second entity));
determine a first operational score associated with the system based on at least the obtained set of media characteristics and the obtained contextual information ([paragraph 0071-0072, 0094, 0116-0117] describes determining by the security analytics system (e.g. the computer) an operational score related to a risk associated with the first user or sending entity (e.g. the first entity) according to obtained media attributes associated with electronic communication and obtained contextual information),
wherein the first operational score is indicative of operating conditions associated with the system for the communication of the media data over the network ([paragraph 0116-0117, 0131-0132] describes the operational score related to the risk is indicative of condition existing at operation associated with the first user or sending entity (e.g. the first entity) for the electronic communication related to media transmission over the network);
identify a set of anomalies associated with the system based on a comparison of the first operational score with a threshold ([paragraph 0071-0072, 0122-0124] describes identifying anomalous, abnormal, unexpected or malicious behavior associated with the first user or sending entity (e.g. the first entity) based on comparing the operational score related to the risk and the number of linked contacts with a threshold number);
Black does not explicitly disclose determine a set of operations to resolve the set of anomalies associated with the system; and execute the determined set of operations on the media data.
However, in a similar field of endeavor, Olson discloses determine a set of operations to resolve the set of anomalies associated with the system ([paragraph 0002, 0038-0039] describes a first entity and a second entity, a threshold metric associated with anomalous behavior, a determination based upon the threshold metric that the first entity is fraudulent, and a system (the computer) for identifying fraudulent entities; [paragraph 0124-0126] describes determining by the system (the computer) various operations for resolving anomalies associated with the first entity such as a reduction in transmission of fraudulent media requests (and/or a reduction in bandwidth) (e.g., as a result of discouraging fraudulent first entity from performing malicious actions to control client devices for transmission of media requests)); and
execute the determined set of operations on the media data ([paragraph 0124-0126] describes fraudulent first entity may be discouraged from performing malicious actions (e.g., using one or more automated operation functionalities, hacking techniques, malware, etc.) to control client devices for transmission of media requests because, by implementing one or more of the techniques presented herein and as a result of identifying a fraudulent entity associated with fraudulent activity, as a result of controlling, such as restricting, transmission of data, such as content items and/or advertisements, to devices associated with the fraudulent entity based upon the identification of the fraudulent entity, etc.).
Therefore, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Black to include determine a set of operations to resolve the set of anomalies associated with the system; and execute the determined set of operations on the media data as taught by Olson. One of ordinary skill in the art would be motivated to utilize the teachings of Black in the Olson system in order to more accurately identify fraudulent entities performing fraudulent activity ([paragraph 0090] in Olson).
9. Claims 5-13 and 17-19 are rejected under 35 U.S.C. 103 as being unpatentable over Black et al. (US 2020/0195662 A1) in view of Olson et al. (US 2023/0156024 A1), and further in view of Bonafonte et al. (US 2023/0113297 A1).
Regarding Claim 5, the combination of Black and Olson teaches the computer-implemented method, wherein the set of operations comprises at least one of an encoding operation, a decoding operation, a backup operation, a session re-initiation operation, a bandwidth throttling operation, a rate limiting operation, or a load balancing operation (Olson: [paragraph 0111, 0117, 0125-0126] describes various operations includes reduction in bandwidth, a rate limit operation etc.).
Therefore, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Black to include the set of operations comprises at least one of an encoding operation, a decoding operation, a backup operation, a session re-initiation operation, a bandwidth throttling operation, a rate limiting operation, or a load balancing operation as taught by Olson. One of ordinary skill in the art would be motivated to utilize the teachings of Black in the Olson system in order to provide a rate of content item presentations per unit of time ([paragraph 0054] in Olson).
Black and Olson fail to teach wherein the set of operations comprises an encoding operation.
However, Bonafonte teaches wherein the set of operations comprises an encoding operation ([paragraph 0068-0069] describes various operations include an encoding operation).
Therefore, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Black/Olson to include the set of operations comprises an encoding operation as taught by Bonafonte. One of ordinary skill in the art would be motivated to utilize the teachings of Black/Olson in the Bonafonte system in order to provide a linguistic encoder that processes input text data to determine first encoded data representing an input such as an utterance ([paragraph 0020] in Bonafonte).
Regarding Claim 6, the combination of Black, Olson and Bonafonte teaches the computer-implemented method, wherein the encoding operation comprises: controlling, by the computer, the first entity to obtain audio data associated with the media data, wherein the audio data comprises at least a first speech of a participant associated with the first entity (Bonafonte: [paragraph 0018-0019, 0035, 0068-0069] describes encoding operation controlling by the system any entity (e.g. first entity) to obtain audio data associated with media data and audio data includes a speech of a user (e.g. participant) associated with any entity (e.g. the first entity)); and
controlling, by the computer, the first entity to generate text data comprising at least a text corresponding to the first speech of the participant associated with the first entity, wherein the text data is generated based on the obtained audio data, and wherein a size of the text data is less than a size of the audio data (Bonafonte: [paragraph 0018-0019, 0035, 0068-0069] describes controlling by the system any entity (e.g. first entity) to output text data corresponding to the speech of the user (e.g. participant) associated with entity the output text data is based on the obtained audio data, and wherein a size of audio data is more than the text data).
Therefore, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Black/Olson to include wherein the encoding operation comprises: controlling, by the computer, the first entity to obtain audio data associated with the media data, wherein the audio data comprises at least a first speech of a participant associated with the first entity; and controlling, by the computer, the first entity to generate text data comprising at least a text corresponding to the first speech of the participant associated with the first entity, wherein the text data is generated based on the obtained audio data, and wherein a size of the text data is less than a size of the audio data as taught by Bonafonte. One of ordinary skill in the art would be motivated to utilize the teachings of Black/Olson in the Bonafonte system in order to provide a linguistic encoder that processes input text data to determine first encoded data representing an input such as an utterance ([paragraph 0020] in Bonafonte).
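For illustration only, the size relationship recited in Claim 6 (text data smaller than the audio data it was generated from) can be checked with simple arithmetic. The sketch below assumes uncompressed PCM audio at 16 kHz, 16 bits per sample, mono, and UTF-8 encoded text; these parameters and the function names are assumptions made for the example and are not taken from the application or from Bonafonte.

```python
def pcm_audio_size_bytes(duration_s: float, sample_rate_hz: int = 16000,
                         bytes_per_sample: int = 2, channels: int = 1) -> int:
    """Size of uncompressed PCM audio under the assumed format."""
    return int(duration_s * sample_rate_hz * bytes_per_sample * channels)


def text_size_bytes(text: str) -> int:
    """Size of a transcript when encoded as UTF-8."""
    return len(text.encode("utf-8"))


def text_is_smaller(duration_s: float, transcript: str) -> bool:
    """True when the transcript occupies fewer bytes than the audio."""
    return text_size_bytes(transcript) < pcm_audio_size_bytes(duration_s)
```

Under these assumptions, ten seconds of audio occupies 320,000 bytes, while a transcript of a few dozen words occupies a few hundred bytes at most.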
Regarding Claim 7, the combination of Black, Olson and Bonafonte teaches the computer-implemented method, wherein the text data further comprises a set of speech characteristics associated with the first speech of the participant, and wherein the set of speech characteristics comprises at least one of a tone of the first speech, a pitch of the first speech, a rate of the first speech, an intensity of the first speech, a total number of words in the first speech, an accent in the first speech, or a pattern of pauses in the first speech (Bonafonte: [paragraph 0018-0019] describes text data includes speech characteristics associated with speech of the user (e.g. participant) and speech characteristics include tone, speech rate, emphasis, and/or accent, etc., of words represented in the audio data).
Therefore, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Black/Olson to include wherein the text data further comprises a set of speech characteristics associated with the first speech of the participant, and wherein the set of speech characteristics comprises at least one of a tone of the first speech, a pitch of the first speech, a rate of the first speech, an intensity of the first speech, a total number of words in the first speech, an accent in the first speech, or a pattern of pauses in the first speech as taught by Bonafonte. One of ordinary skill in the art would be motivated to utilize the teachings of Black/Olson in the Bonafonte system in order to emphasize particular words ([paragraph 0018] in Bonafonte).
Regarding Claim 8, the combination of Black, Olson and Bonafonte teaches the computer-implemented method, wherein the encoding operation further comprises: controlling, by the computer, the first entity to generate the text data, wherein the text data is generated based on an application of a set of machine learning (ML) models on the obtained audio data (Bonafonte: [paragraph 0018, 0038-0040, 0049] describes controlling by the system the entity to generate the text data based on an application of various machine learning models such as trained neural-network models, for performing various functions associated with speech processing on the obtained audio data).
Therefore, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Black/Olson to include controlling, by the computer, the first entity to generate the text data, wherein the text data is generated based on an application of a set of machine learning (ML) models on the obtained audio data as taught by Bonafonte. One of ordinary skill in the art would be motivated to utilize the teachings of Black/Olson in the Bonafonte system in order to process audio data (and/or other input data) corresponding to a command and determine corresponding output data, which may be text data, audio data, and/or video data ([paragraph 0018] in Bonafonte).
Regarding Claim 9, the combination of Black, Olson and Bonafonte teaches the computer-implemented method, wherein the set of ML models comprises a first ML model trained to generate the text data, and wherein the text data is generated based on at least the first speech of the participant included in the obtained audio data (Bonafonte: [paragraph 0033, 0038-0040, 0049] describes various machine learning models such as trained neural-network models includes a first trained model (e.g. ML Model) trained to generate the text data based on speech or utterance of the user (e.g. participant) included in the obtained audio data).
Therefore, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Black/Olson to include wherein the set of ML models comprises a first ML model trained to generate the text data, and wherein the text data is generated based on at least the first speech of the participant included in the obtained audio data as taught by Bonafonte. One of ordinary skill in the art would be motivated to utilize the teachings of Black/Olson in the Bonafonte system in order to process audio data (and/or other input data) corresponding to a command and determine corresponding output data, which may be text data, audio data, and/or video data ([paragraph 0018] in Bonafonte).
Regarding Claim 10, the combination of Black, Olson and Bonafonte teaches the computer-implemented method, wherein the decoding operation comprises: generating, by the computer, natural audio data based on the generated text data, wherein the natural audio data comprises at least a second speech corresponding to the text included in the generated text data (Bonafonte: [paragraph 0031] describes decoding operation; Bonafonte: [paragraph 0019-0021, 0033, 0038-0040, 0049] describes generating by the system (e.g. computer) natural-language understanding (NLU) audio data based on the generated text data and natural-language understanding (NLU) audio data includes additional speech (e.g. a second speech) corresponding to the text included in the generated text data).
Therefore, it would have been obvious to one having ordinary skill in the art before the effective filling date of the claimed invention to modify the teachings of Black/Olson to include wherein the decoding operation comprises: generating, by the computer, natural audio data based on the generated text data, wherein the natural audio data comprises at least a second speech corresponding to the text included in the generated text data as taught by Bonafonte. One ordinary skill in the art would be motivated to utilize the teachings of Black/Olson in the Bonafonte system in order to perform various functions associated with speech processing ([paragraph 0018] in Bonafonte).
Regarding Claim 11, the combination of Black, Olson and Bonafonte teaches the computer-implemented method, wherein the decoding operation further comprises: generating, by the computer, the natural audio data based on the application of the set of ML models on the generated text data (Bonafonte: [paragraph 0018-0021, 0031, 0038-0040, 0049] describes that the decoding operation includes generating, by the system, the natural-language understanding (NLU) audio data based on an application of various machine learning models, such as trained neural-network models for performing various functions associated with speech processing, on the generated text data),
wherein the set of ML models further comprises a second ML model trained to generate the natural audio data, and wherein the natural audio data is generated based on at least the text included in the generated text data (Bonafonte: [paragraph 0018-0021, 0038-0040, 0049] describes that the various machine learning models include a second trained model trained to generate the natural-language understanding (NLU) audio data based on the text included in the generated text data).
Therefore, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Black/Olson to include wherein the decoding operation includes generating, by the computer, the natural audio data based on the application of the set of ML models on the generated text data, wherein the set of ML models further comprises a second ML model trained to generate the natural audio data which is generated based on at least the text included in the generated text data as taught by Bonafonte. One of ordinary skill in the art would be motivated to utilize the teachings of Black/Olson in the Bonafonte system in order to process input data corresponding to a command from a user and determine output data corresponding to a response to the command ([paragraph 0002] in Bonafonte).
Regarding Claim 12, the combination of Black, Olson and Bonafonte teaches the computer-implemented method, further comprising: controlling, by the computer, the second entity to output at least the second speech corresponding to the text included in the generated text data (Bonafonte: [paragraph 0018-0021, 0038-0044] describes controlling by the system (e.g. computer) another user (e.g. second entity) to output a second synthesized voice (e.g. speech) corresponding to the text included in the generated text data).
Therefore, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Black/Olson to include controlling, by the computer, the second entity to output at least the second speech corresponding to the text included in the generated text data as taught by Bonafonte. One of ordinary skill in the art would be motivated to utilize the teachings of Black/Olson in the Bonafonte system in order to determine scores indicating whether the audio data originated from a particular user or speaker ([paragraph 0041] in Bonafonte).
Regarding Claim 13, the combination of Black, Olson and Bonafonte teaches the computer-implemented method, wherein the set of ML models is trained based on training data (Bonafonte: [paragraph 0018-0021] describes that the various machine learning models are trained based on training data),
wherein the training data comprises at least one of a first data set comprising historical data associated with historical communication events between the first entity and the second entity over the network, or a second data set comprising training speech data associated with the participant (Bonafonte: [paragraph 0018-0021, 0029, 0038-0039] describes that the training data includes a second data set comprising training speech or utterance data associated with the user (e.g. participant)).
Therefore, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Black/Olson to include wherein the set of ML models is trained based on training data and wherein the training data comprises at least one of a first data set comprising historical data associated with historical communication events between the first entity and the second entity over the network, or a second data set comprising training speech data associated with the participant as taught by Bonafonte. One of ordinary skill in the art would be motivated to utilize the teachings of Black/Olson in the Bonafonte system in order to better approximate the variations that occur naturally in human speech ([paragraph 0021] in Bonafonte).
Regarding Claim 17, this claim contains limitations found within that of Claim 5, and the same rationale for rejection is used.
Regarding Claim 18, the combination of Black, Olson and Bonafonte teaches the system, wherein to execute the encoding operation the processor set is further configured to: control the second entity to obtain audio data associated with the media data, wherein the audio data comprises at least a first speech of a participant associated with the second entity (Bonafonte: [paragraph 0018-0019, 0035, 0068-0069] describes encoding operation controlling by the system any entity (e.g. second entity) to obtain audio data associated with media data and audio data includes a speech of a user (e.g. participant) associated with any entity (e.g. the second entity)); and
generate text data based on the obtained audio data, wherein the text data comprises at least a text corresponding to the first speech of the participant associated with the second entity, and wherein a size of the text data is less than a size of the audio data (Bonafonte: [paragraph 0018-0019, 0035, 0068-0069] describes controlling, by the system, any entity (e.g. second entity) to output text data corresponding to the speech of the user (e.g. participant) associated with the entity, wherein the output text data is based on the obtained audio data, and wherein a size of the audio data is greater than a size of the text data).
Therefore, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Black/Olson to include wherein the encoding operation comprises: control the second entity to obtain audio data associated with the media data, wherein the audio data comprises at least a first speech of a participant associated with the second entity and generate text data based on the obtained audio data, wherein the text data comprises at least a text corresponding to the first speech of the participant associated with the second entity, and wherein a size of the text data is less than a size of the audio data as taught by Bonafonte. One of ordinary skill in the art would be motivated to utilize the teachings of Black/Olson in the Bonafonte system in order to provide a linguistic encoder that processes input text data to determine first encoded data representing an input such as an utterance ([paragraph 0020] in Bonafonte).
Regarding Claim 19, the combination of Black, Olson and Bonafonte teaches the system, wherein to execute the decoding operation the processor set is further configured to: control the first entity to generate natural audio data comprising at least a second speech corresponding to the text included in the generated text data, wherein the natural audio data is generated based on the generated text data (Bonafonte: [paragraph 0031] describes a decoding operation; [paragraph 0019-0021, 0033, 0038-0040, 0049] describes generating, by the system (e.g. computer), natural-language understanding (NLU) audio data based on the generated text data, wherein the natural-language understanding (NLU) audio data includes additional speech (e.g. a second speech) corresponding to the text included in the generated text data).
Therefore, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Black/Olson to include wherein to execute the decoding operation the processor set is further configured to: control the first entity to generate natural audio data comprising at least a second speech corresponding to the text included in the generated text data, wherein the natural audio data is generated based on the generated text data as taught by Bonafonte. One of ordinary skill in the art would be motivated to utilize the teachings of Black/Olson in the Bonafonte system in order to perform various functions associated with speech processing ([paragraph 0018] in Bonafonte).
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure:
- Alperin et al., US 12326869 B1, A system for delivering contextual responses through dynamic integrations of digital information repositories with inquiries.
- Crabtree et al., US 20240386015 A1, A semantic search system integrates with an AI platform to provide advanced search capabilities by leveraging automatically generated ontologies and knowledge graphs.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MEHULKUMAR J SHAH whose telephone number is (571)272-1072. The examiner can normally be reached Mon-Fri, 6:05 am-3:55 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, TONIA DOLLINGER, can be reached at 571-272-4170. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/M.J.S/Examiner, Art Unit 2459 /TONIA L DOLLINGER/Supervisory Patent Examiner, Art Unit 2459