DETAILED ACTION
This is responsive to the application filed 28 March 2024.
Claims 1-20 are pending and considered below.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. Further, the judicial exception is not integrated into a practical application.
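In claims 1, 11 and 20, the limitations of receiving voice input, generated during a learning session, from a user; producing a transcript of the received voice input; and performing an action with respect to the transcript as the transcript is produced, as drafted, cover performance of the limitations in the mind under their broadest reasonable interpretation.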
That is, other than reciting a “transcript production system” (claims 1, 11 and 20), a “system comprising: sensors located at a secure location; a processor operatively coupled the sensors; a memory device that stores instructions that, when executed by the processor, causes the system to” (claim 11) and a “product comprising: a computer-readable storage device that stores executable code that, when executed by a processor, causes the product to” (claim 20), nothing in the claims precludes the steps from practically being performed in the mind. For example, a person may receive voice input, generated during a learning session, from a user; produce, from the received voice input, a transcript of the received voice input (e.g. a human may listen to a lecture and produce a transcription of a professor’s speech); and perform an action with respect to the transcript as the transcript is produced (e.g. a human may translate or summarize the transcription in real-time).
If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claims recite an abstract idea.
This judicial exception is not integrated into a practical application. In particular, the claims recite the additional elements of a “transcript production system” (claims 1, 11 and 20), a “system comprising: sensors located at a secure location; a processor operatively coupled the sensors; a memory device that stores instructions that, when executed by the processor, causes the system to” (claim 11) and a “product comprising: a computer-readable storage device that stores executable code that, when executed by a processor, causes the product to” (claim 20), which are recited at a high level of generality (i.e., as generic processors performing generic computer functions) such that they amount to no more than mere instructions to apply the exception using generic computer components.
Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claims are therefore directed to an abstract idea.
The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception because the additional elements when considered both individually and as an ordered combination do not amount to significantly more than the abstract idea. As stated above, the claims recite the additional limitations of a “transcript production system” (claims 1, 11 and 20), a “system comprising: sensors located at a secure location; a processor operatively coupled the sensors; a memory device that stores instructions that, when executed by the processor, causes the system to” (claim 11) and a “product comprising: a computer-readable storage device that stores executable code that, when executed by a processor, causes the product to” (claim 20). However, these are recited at a high level of generality and are recited as performing generic computer functions routinely used in computer applications (see Applicant’s specification [0022], [0024], [0030]-[0034] and [0067]-[0071]). Generic computer components recited as performing generic computer functions that are well-understood, routine and conventional activities amount to no more than implementing the abstract idea with a computerized system.
Thus, taken alone, the additional elements do not amount to significantly more than the above-identified judicial exception (the abstract idea). Looking at the limitations as an ordered combination adds nothing that is not already present when looking at the elements taken individually. There is no indication that the combination of elements improves the functioning of a computer or improves any other technology. Their collective functions merely provide conventional computer implementation. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. The claims are not patent eligible.
The dependent claims, when analyzed as a whole, are held to be patent ineligible under 35 U.S.C. 101 because the additional recited limitations fail to establish that the claims are not directed to an abstract idea.
The dependent claims recite:
wherein the producing is performed in real-time as the voice input is provided (e.g. a human may transcribe in real-time);
wherein the performing an action comprises dynamically altering the transcript into an updated version of the transcript and providing the updated version to a user device in real-time (e.g. a human may translate the transcription);
wherein the performing an action comprises identifying an action from a user command received (e.g. a human may translate to a user selected language);
wherein the performing an action comprises translating the transcript to a language different than the transcript produced from the received voice input (e.g. a human may translate the transcription);
wherein the performing an action comprises providing the transcript to an artificial intelligence model (the AI model is recited at a high level of generality (i.e., as generic processors performing generic computer functions) such that it amounts to no more than mere instructions to apply the exception using generic computer components; also, the model merely performs generic computer functions routinely used in computer applications);
comprising generating a virtual agent that utilizes the artificial intelligence model to respond to input provided by a user (the virtual agent is recited at a high level of generality (i.e., as generic processors performing generic computer functions) such that it amounts to no more than mere instructions to apply the exception using generic computer components; also, the agent merely performs generic computer functions routinely used in computer applications);
wherein the performing an action comprises dynamically altering, using the artificial intelligence model, the transcript of the voice input into an updated version having different characteristics than the transcript (e.g. a human may translate the transcription);
wherein the performing an action comprises identifying a topic contained within the transcript and displaying, on a user device, secondary content related to the topic and obtained from a secondary source (e.g. a human may identify a topic in the transcription and find related text from a source. Displaying is extra-solution activity that does not integrate the abstract idea into a practical application. Further, displaying data is routinely performed in the computing field and does not represent significantly more than the abstract idea);
wherein the performing an action comprises summarizing content contained within the transcript (e.g. a human may summarize the transcription).
Therefore, the additional recited limitations further narrow the steps of the independent claims without, however, providing “a practical application of” or “significantly more than” the underlying “Mental Processes” abstract idea. Accordingly, the dependent claims are also not patent eligible.
Moreover, see Recentive Analytics, Inc. v. Fox Corp. (Fed. Cir. April 18, 2025): “Machine learning is a burgeoning and increasingly important field and may lead to patent-eligible improvements in technology. Today, we hold only that patents that do no more than claim the application of generic machine learning to new data environments, without disclosing improvements to the machine learning models to be applied, are patent ineligible under § 101.”
Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.
Claims 1-9 and 11-20 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Nguyen et al. (US 2021/0174787).
Claim 1:
Nguyen discloses a method, the method comprising:
receiving, at a transcript production system, voice input, generated during a learning session, from a user; producing, from the received voice input and utilizing the transcript production system, a transcript of the received voice input (“Speaking user 108 speaks and that audio signal is received by computing device 110. Computing device 110 sends the audio signal to STT service 122. STT service 122 analyzes the audio signal and generates a textual transcription based on that analysis”, [0032], see also “a lecture that speaking user 108 is giving”, [0033]); and
performing, utilizing the transcript production system, an action with respect to the transcript as the transcript is produced (“translation service 124 may translate the transcription to one or more other languages that the audio was not originally received in”, [0032], note that the combination of STT service 122 and translation service 124 reads on Applicant’s transcript production system).
Claim 2:
Nguyen discloses the method of claim 1, wherein the producing is performed in real-time as the voice input is provided ([0034]).
Claim 3:
Nguyen discloses the method of claim 2, wherein the performing an action comprises dynamically altering the transcript into an updated version of the transcript and providing the updated version to a user device in real-time (“translation service 124 may translate the transcription to one or more other languages that the audio was not originally received in”, [0032], see also [0033]).
Claim 4:
Nguyen discloses the method of claim 1, wherein the performing an action comprises identifying an action from a user command received (“a user associated with computing device 104A has selected English as her preferred language for receiving a transcription of the transcription instance”, [0034], see also “a selection has been made of translation language element 612 in transcription pane 606. In this example, the selection is made via a mouse click on translation language element 612. However, other selection mechanisms are contemplated (e.g., touch input, voice input, etc.). Upon selection of translation language element 612, a plurality of selectable elements for modifying the language that captions 608 are surfaced in is caused to be displayed.”, [0049]).
Claim 5:
Nguyen discloses the method of claim 1, wherein the performing an action comprises translating the transcript to a language different than the transcript produced from the received voice input ([0032]).
Claim 6:
Nguyen discloses the method of claim 1, wherein the performing an action comprises providing the transcript to an artificial intelligence model (supervised machine learning model 204 and/or neural network 206) (“Translation service 224 includes one or more language processing models that may be utilized in translating output received from STT service 222. Those models are illustrated by supervised machine learning model 204, neural network 206, and language processing model 208”, [0040]).
Claim 7:
Nguyen discloses the method of claim 6, comprising generating a virtual agent (productivity application) that utilizes the artificial intelligence model to respond to input provided by a user ([0048]-[0049], note that translation is based on a particular language selected by the user).
Claim 8:
Nguyen discloses the method of claim 6, wherein the performing an action comprises dynamically altering, using the artificial intelligence model, the transcript of the voice input into an updated version having different characteristics than the transcript ([0032], see also [0040]).
Claim 9:
Nguyen discloses the method of claim 1, wherein the performing an action comprises identifying a topic contained within the transcript and displaying, on a user device, secondary content related to the topic and obtained from a secondary source (“a determination may be made as to a subject matter type that the transcription relates to. In such examples, the productivity application may surface one or more saved notes that relate to the subject matter of the transcription”, [0065]).
Claims 11-18:
Nguyen discloses a system, the system comprising: sensors located at a secure location; a processor operatively coupled the sensors; a memory device that stores instructions that, when executed by the processor ([0082], see also [0090]-[0091]), causes the system to perform the steps of process claims 1-3 and 5-9 as shown above.
Claim 20:
Nguyen discloses a product, the product comprising: a computer-readable storage device that stores executable code that, when executed by a processor ([0094]), causes the product to perform the steps of process claim 1 as shown above.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 10 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Nguyen et al. (US 2021/0174787) in view of Merril et al. (US 2007/0033528).
Claim 10:
Nguyen discloses the method of claim 1, but does not explicitly disclose wherein the performing an action comprises summarizing content contained within the transcript.
In an analogous system similarly performing an action with respect to a transcript, Merril et al. (US 2007/0033528) discloses wherein the performing an action comprises summarizing content contained within the transcript (“voice recognition can be performed on the lecture audio using a product such as Naturally Speaking.TM. available from Dragon Systems of Newton, Mass. The optical and voice recognition processes can be used to generate text documents containing full transcripts of both slide content and audio of an actual lecture. In another implementation, these transcripts can be processed by outline-generating software, such as LinguistX.TM. from InXight of Palo Alto, Calif., which can summarize the lecture transcripts”, [0061]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the references to yield the predictable result that Nguyen’s performing an action comprises summarizing content contained within the transcript, so that “automatic summarization and outlining software can be applied to the transcripts to create indexes and outlines that are easily searchable by an end-user” (see Merril, [0044]).
Claim 19:
Nguyen in view of Merril discloses a system, the system comprising: sensors located at a secure location; a processor operatively coupled the sensors; a memory device that stores instructions that, when executed by the processor ([0082], see also [0090]-[0091]), causes the system to perform the steps of process claim 10 as shown above.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Saindon et al. (US 2002/0161579) discloses systems and methods for receiving live speech, converting the speech to text, and transferring the text to a user. As desired, the speech or text can be translated into one or more different languages. Systems and methods for real-time conversion and transmission of speech and text are provided.
Stoker et al. (US 2020/0321007) discloses a real-time audio transcription, video conferencing, and online collaboration system which includes a microphone that records audio from a lecturer, a camera that captures video from the lecturer and/or users, and a user interface for viewing a transcription of the audio. The audio recording is transmitted to a storage device, such as a personal computer or mobile device, which transmits the audio to a voice-to-text application for transcription. The storage device may do so via a third-party cloud server, a web application, or a software application. The transcribed text is then transmitted to a user interface for viewing by hearing-impaired persons. The transcribed text is provided in real-time with the lecturer and audio-recording and presented word-for-word to the user during transcription and may be edited in real-time to improve accuracy of the automatic transcription.
Raina (US 2022/0300719) discloses a system for generating a multilingual transcript from a multilingual audio input. The system includes a processor configured to: receive, from a source, a set of first signals pertaining to the multilingual audio input; extract, based on the set of first signals, one or more attributes of the multilingual audio input and correspondingly generate a set of second signals; and convert, based on the set of second signals, the multilingual audio input into a plurality of monolingual transcripts having a respective plurality of segments. The plurality of segments is associated with a plurality of languages present in the multilingual audio input.
Kelly et al. (US 2023/0154465) discloses a method of analyzing instructor discourse which includes recording an audio signal representing speech of the instructor during a class session, converting the audio signal to a session transcript comprising speech data for the session using an automatic speech recognition tool and segmenting the transcript into utterances, extracting a set of features from the session transcript, filtering student talk out from the utterances, analyzing a first subset of the features to produce a number of local context predictions for each utterance of the session transcript, analyzing a second subset of the features to produce a number of global context predictions for the session transcript, and combining a subset of the number of local context predictions and the number of global context predictions into a classification that attends to differential reliability.
Li et al. (US 2024/0412737) discloses receiving a sound signal from a lecture room, the sound signal including speech of a lecturer. A plurality of students in the lecture room are watching a lecturer explain a graph which is displayed on a screen on a wall of the room. Some of the students have their own laptop computer, tablet computer or other portable computing device in order to view a transcription of the lecturer's speech to facilitate their learning.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SAMUEL G NEWAY whose telephone number is (571)270-1058. The examiner can normally be reached Monday-Friday 9:00am-5:00pm EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Daniel Washburn, can be reached at 571-272-5551. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/SAMUEL G NEWAY/ Primary Examiner, Art Unit 2657