Last updated: May 29, 2026
Application No. 18/687,676
SYSTEM AND METHOD OF PROVIDING ARTIFICIAL INTELLIGENCE-BASED SURGICAL RESULT REPORT USING VOICE RECOGNITION PLATFORM

Non-Final OA §101 §103
Filed: Oct 29, 2024
Priority: Aug 30, 2021, RE 10-2021-0114807 (+1 more)
Examiner: LEE, ANDREW ELDRIDGE
Art Unit: 3684
Tech Center: 3600 (Transportation & Electronic Commerce)
Assignee: Industry Academic Cooperation Foundation Keimyung University
OA Round: 1 (Non-Final)
This examiner grants 17% of cases after interview

+33.0% interview lift. A telephonic interview to clarify the technical implementation could significantly improve the outcome.
Based on 132 resolved cases, 2023–2026
Examiner Intelligence

LEE, ANDREW ELDRIDGE
Career Allowance Rate: 17% (23 granted / 132 resolved), -34.6% vs TC avg
Interview Lift: +33.0% (resolved cases with interview vs. without)
Avg Prosecution (typical timeline): 3y 9m
Currently pending: 20
Total Applications (career history): 176 across all art units
Statute-Specific Performance

§101: 4.7% (-35.3% vs TC avg)
§103: 75.8% (+35.8% vs TC avg)
§102: 18.5% (-21.5% vs TC avg)
§112: 0.4% (-39.6% vs TC avg)
The Tech Center average is an estimate. Based on career data from 132 resolved cases.
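The figures above can be cross-checked with a little arithmetic. The sketch below recomputes the career allowance rate from the granted and resolved counts, then backs out the Tech Center baseline implied by each "vs TC avg" delta. It assumes each delta is the examiner's value minus the Tech Center average, in percentage points; the page does not say how the deltas are computed, so treat that as an assumption, not a documented definition. All variable names are illustrative.

```python
# Figures copied from the examiner statistics above.
granted, resolved = 23, 132
allowance_rate = 100 * granted / resolved  # percent

# Assumption: delta = examiner value - Tech Center average (percentage points).
statute_stats = {
    "101": (4.7, -35.3),
    "103": (75.8, +35.8),
    "102": (18.5, -21.5),
    "112": (0.4, -39.6),
}

# Implied Tech Center baseline per statute under that assumption.
implied_tc_avg = {
    statute: round(value - delta, 1)
    for statute, (value, delta) in statute_stats.items()
}

# Same calculation for the career allowance rate (-34.6 vs TC avg).
implied_allowance_tc_avg = round(allowance_rate - (-34.6), 1)

print(f"allowance rate: {allowance_rate:.1f}%")
print("implied TC averages by statute:", implied_tc_avg)
print("implied TC allowance average:", implied_allowance_tc_avg)
```

Under this reading, every statute-specific delta points back to the same baseline, which fits the note that the Tech Center average is a single estimate.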
Office Action

§101 §103
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

DETAILED ACTION
In the preliminary response filed on 28 February 2024, the following has occurred: claims 1-10 have been amended.
Now claims 1-10 are pending.

Priority
Applicant’s claim for the benefit of a prior-filed application under 35 U.S.C. 119(e) or under 35 U.S.C. 120, 121, 365(c), or 386(c) is acknowledged. 

Information Disclosure Statement
The Information Disclosure Statement(s) filed on 28 February 2024 has been considered by the Examiner.

Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.  The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is invoked. 
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. 
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 
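As a reading aid, the three-prong test quoted above can be sketched as a simple boolean check. This is only an illustration: whether a term is a generic placeholder, and whether a limitation recites sufficient structure, are legal judgments made by the examiner, so they appear here as hand-set flags. The class name, field names, and example limitation flags are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Limitation:
    text: str
    uses_means_or_step: bool            # literal "means" or "step"
    uses_generic_placeholder: bool      # nonce term standing in for "means"
    has_functional_language: bool       # e.g. "configured to ..."
    recites_sufficient_structure: bool  # structure, material, or acts


def interpreted_under_112f(lim: Limitation) -> bool:
    """Apply prongs (A), (B), and (C); all three must be met."""
    prong_a = lim.uses_means_or_step or lim.uses_generic_placeholder
    prong_b = lim.has_functional_language
    prong_c = not lim.recites_sufficient_structure
    return prong_a and prong_b and prong_c


# Hypothetical flags reflecting how this action treats "a learning unit".
learning_unit = Limitation(
    text="a learning unit configured to learn by artificial intelligence",
    uses_means_or_step=False,
    uses_generic_placeholder=True,
    has_functional_language=True,
    recites_sufficient_structure=False,
)
print(interpreted_under_112f(learning_unit))
```

Setting `recites_sufficient_structure` to `True` makes prong (C) fail, which corresponds to option (1) of the applicant's possible responses: amending the claim to recite sufficient structure.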
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier.  Such claim limitation(s) is/are: 
a standard question generation unit in claim 1
a question-and-answer generation unit in claim 1
a learning unit in claim 1
a control unit in claim 1
a natural language understanding (NLU) module in claim 4
a voice synthesis module in claim 4
a receiving unit in claim 5
an automatic speech recognition (ASR) module in claim 5
a classification unit in claim 6
The various units and modules are being read, from Figure 2 and paragraphs [0025] and [0056], as a combination of software implemented by a server.
Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-10 are rejected under 35 U.S.C. 101 because the claimed invention is directed to a judicial exception (i.e., a law of nature, a natural phenomenon, or an abstract idea) without significantly more.
Claims 1 and 8 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claims recite a system and method using a human user’s voice to provide a surgical report. The limitations of:
Claim 1, which is representative of claim 8
[…] analyze a standard surgical result report including surgical procedures and surgical and diagnostic results for a specific medical department to [… organize …] the surgical procedures and the surgical and diagnostic results into string data, and extracts keywords from the string data to generate respective diagnostic standard question data including medical information; […] extract keywords from response areas included in the respective diagnostic standard question data; […] using the respective diagnostic standard question data and response data corresponding thereto as learning data, wherein the respective diagnostic standard question data comprises response selection numbers selected in response to questions; and [… organize …] the respective diagnostic standard question data into user voice signals to [… provide …] the converted voice signals to a user […], [… obtain …] the response selection numbers as voice signals from the user […], and outputs response data corresponding to the response selection numbers […].
, as drafted, is a system which, under its broadest reasonable interpretation, covers a method of organizing human activity (i.e., managing personal behavior, including following rules or instructions) via human interaction with generic computer components. That is, by a human user interacting with various units and a user terminal, the claimed invention amounts to managing personal behavior or interactions between people. The Examiner notes that, as stated in MPEP 2106.04(a)(2), “certain activity between a person and a computer… may fall within the ‘certain methods of organizing human activity’ grouping.” For example, by human interaction with various units and a user terminal, the claim encompasses organizing questions and answers to be provided to a human user for the human user to create a surgical report using their voice and the organized questions and answers. If a claim limitation, under its broadest reasonable interpretation, covers managing personal behavior or interactions between people but for the recitation of generic computer components, then it falls within the “certain methods of organizing human activity” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
This judicial exception is not integrated into a practical application. In particular, the claim recites the additional elements of various units and a user terminal, which implement the abstract idea. The various units and the user terminal are recited at a high level of generality (i.e., general-purpose computers/computer components implementing generic computer functions; see Applicant’s Specification Figure 2, paragraphs [0025], [0056]) such that they amount to no more than mere instructions to apply the exception using generic computer components. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
The claim also recites the additional elements of “convert…”, “a learning unit configured to learn by artificial intelligence in conjunction with an artificial neural processing network”, and “transmit… receive…”. The “convert…” element is recited at a high level of generality (i.e., transforming data in a generic manner) and amounts to generally linking the abstract idea to a particular technological environment. The “a learning unit configured to learn by artificial intelligence in conjunction with an artificial neural processing network” element is recited at a high level of generality (i.e., training and using an off-the-shelf machine learning algorithm in a generic manner) and amounts to generally linking the abstract idea to a particular technological environment. The “transmit… receive…” steps are recited at a high level of generality (i.e., as a general means of receiving/transmitting data) and amount to the mere transmission and/or receipt of data, which is a form of extra-solution activity. Accordingly, even in combination, these additional elements do not integrate the abstract idea into a practical application. The claim is directed to an abstract idea.
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional elements of various units and a user terminal, used to perform the noted steps, amount to no more than mere instructions to apply the exception using generic hardware components. Mere instructions to apply an exception using generic hardware components cannot provide an inventive concept ("significantly more").
Also, as discussed above with respect to integration of the abstract idea into a practical application, the additional elements of “convert…”, “a learning unit configured to learn by artificial intelligence in conjunction with an artificial neural processing network” and “transmit… receive…” were considered extra-solution activity and/or generally linking the abstract idea to a particular technological environment. The “convert…” element has been re-evaluated under the “significantly more” analysis and determined to be a well-understood, routine, and conventional element/function. As described in Backes (20100169092): paragraph [0116] and claim 15; Chiang (20200176116): paragraphs [0019]-[0020]; Casella dos Santos (20130238329): paragraph [0040]; and Jones (20190279647): paragraph [0064], converting data in a generic manner is well-understood, routine, and conventional. The “a learning unit configured to learn by artificial intelligence in conjunction with an artificial neural processing network” element has been re-evaluated under the “significantly more” analysis and determined to be a well-understood, routine, and conventional element/function. As described in Chiang (20200176116): paragraph [0020]; Jones (20190279647): paragraphs [0069], [0138]; and Ganmukhi (20220028382): paragraphs [0067]-[0069], [0089], training and use of a machine learning model is well-understood, routine, and conventional. The “transmit… receive…” steps have been re-evaluated under the "significantly more" analysis and determined to be well-understood, routine, and conventional elements/functions. As described in MPEP 2106.05(d)(II)(i), "Receiving or transmitting data over a network" is well-understood, routine, and conventional. Well-understood, routine, and conventional elements/functions cannot provide “significantly more.” As such, the claim is not patent eligible.
Claims 2-7 and 9-10 are similarly rejected because they either further define the abstract idea and/or do not further limit the claim to a practical application or provide an inventive concept such that the claims are subject matter eligible.
Claim 2 further describes generation of a final report, but does not recite any additional elements and therefore cannot provide a practical application and/or significantly more.
Claims 3, 6 and 10 further describe various layers of supervised learning and classification of a result; however, the training and use of an artificial neural processing network was already considered above and is incorporated herein.
Claims 4-6 further describe the conversion of data using various modules and classification of data using a classification unit; however, the various units/modules are recited at a high level of generality (i.e., general-purpose computers/computer components implementing generic computer functions; see Applicant’s Specification Figure 2, paragraphs [0025], [0056]) such that they amount to no more than mere instructions to apply the exception using generic computer components. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional elements of various units/modules, used to perform the noted steps, amount to no more than mere instructions to apply the exception using generic hardware components. Mere instructions to apply an exception using generic hardware components cannot provide an inventive concept ("significantly more").
Claims 7 and 9 further describe labeling of data; however, labeling of data is at best organization of data and is not an additional element sufficient to show a practical application and/or significantly more.
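The order of the eligibility reasoning above can be summarized in a short sketch. It follows the sequence the rejection walks through, using the step labels from MPEP 2106 (Step 2A Prong One, Step 2A Prong Two, Step 2B), and assumes the claim already falls within a statutory category. The inputs are legal conclusions, not something code can derive; the function and parameter names are hypothetical.

```python
def eligible_under_101(
    recites_judicial_exception: bool,
    integrated_into_practical_application: bool,
    adds_significantly_more: bool,
) -> bool:
    """Sketch of the analysis order; assumes a statutory category is claimed."""
    # Step 2A, Prong One: does the claim recite an abstract idea,
    # law of nature, or natural phenomenon?
    if not recites_judicial_exception:
        return True
    # Step 2A, Prong Two: do the additional elements integrate the
    # exception into a practical application?
    if integrated_into_practical_application:
        return True
    # Step 2B: do the additional elements amount to significantly more
    # than the exception itself (an inventive concept)?
    return adds_significantly_more


# The findings stated in this action for claims 1-10.
print(eligible_under_101(True, False, False))
```

Because the checks short-circuit, a response only needs to change one finding to change the result: rebut the abstract-idea characterization, show integration into a practical application, or show significantly more.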

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA  to pre-AIA ) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claim(s) 1-2 and 4-9 are rejected under 35 U.S.C. 103 as being unpatentable over U.S. Patent App. No. 20100169092 (hereafter “Backes”), in view of U.S. Patent App. No. 20200176116 (hereafter “Chiang”), further in view of U.S. Patent App. No. 2013/0238329 (hereafter “Casella dos Santos”).

Regarding (Currently Amended) claim 1, Backes teaches a system of providing [… a …] surgical result report using a voice recognition platform (Backes: Fig. 1, paragraph [0023], “The voice interface OCX can be suitable for any area of medicine… related surgical or diagnostic specialties, and general medicine”, paragraph [0028], “While a user is working on generating a report through a clinical application system 120 using a voice technology, such as a voice recognition engine 160”), the system comprising:
a standard question generation unit configured to analyze a standard surgical result report […], and extracts keywords from the string data to generate respective diagnostic standard question data including medical information; a question-and-answer generation unit configured to extract keywords from response areas included in the respective diagnostic standard question data (Backes: Fig. 1, paragraph [0023], “The voice interface OCX can be suitable for any area of medicine… related surgical or diagnostic specialties, and general medicine”, paragraph [0028], “build a customized medical dictation workflow system from a selection of available user application programs provided”, paragraphs [0033]-[0036], “discrete data elements 162 that are designated to be extracted from the selected form. The determination of which fields constitute discrete data elements is based on the reporting field map 166 as obtained from the clinical tracking system 180… A tracking data module 182 can store the data that is extracted from the report… extracting data from certain fields of a report or pre-defining fields for capturing the data in a report. The clinical tracking system 180 is equipped with information on what data should be extracted… The reporting field map 182 includes modules for field format 184, validation rules 186, validation data 188, and tracking fields that define properties or fields in a report for data capture. The field format module 184 can define fields in a report that data should be extracted from. The validation rules module 186 can provide certain rules for capturing data of certain properties, relating to labels, units, valid values, default values, and required or optional indicators. 
The validation data module 188 can store data regarding the validation rules, either data to construct or execute the rules or data captured from the report fields based on execution of the rules”, paragraph [0100], “it substitutes the string of text corresponding to the macro into the text file”, paragraph [0110], “options such as keyword”. The Examiner interprets that the keywords in a response form (i.e., a surgical report) are extracted to create a dictation (i.e., Q&A) workflow, which teaches what is required of the claim under the broadest reasonable interpretation. Additionally, the Examiner notes that “to respectively generate diagnostic standard question” is an intended use of the extraction of keywords that is not required to occur. This feature has been fully considered by the Examiner; however, the limitation does not provide patentable distinction over the cited prior art because it is an intended use or result of the extraction of keywords. Additionally, the Examiner notes that a claim may be rendered obvious where the limiting function is that of making a set of prior-known elements contiguous, i.e., bringing them together. However, the opposite is also true. In this case, the limiting function is that of splitting prior-known elements and/or functionality into discrete elements: (a) a standard question generation unit and (b) a question-and-answer generation unit. The clinical tracking system that extracts text from a report, as taught by Backes, teaches the functionality of the claimed elements respectively. As such, it would have been obvious to one of ordinary skill in the art at the time of the invention to make the clinical tracking system of Backes, which extracts text from a report, separable without undue experimentation or risk of unexpected results; see In re Dulberg, 289 F.2d 522, 523, 129 USPQ 348, 349 (CCPA 1961); MPEP 2144.04); […]; and
[…] transmit the converted voice signals to a user terminal (Backes: paragraph [0025], “a computer system to obtain a medical dictation workflow system. The computer system can be a personal computer, workstation, or a handheld computing device… The input device may be any device for inputting commands to a computer, which can include one or more of any of the following: keyboard, keypad, infrared transmitter, microphone, voice detector, pointing device, light pen, mouse, touch screen, or stylus. The output device may comprise, for example, a display, such as a monitor, a speechmike, a speaker”), 
receives the response […] as user voice signals from the user terminal, and outputs response data corresponding to the response […] (Backes: Figure 1, paragraph [0028], “The voice interface OCX can incorporate voice technologies into the system, such that the various application programs can process voice data from the voice technologies… While a user is working on generating a report through a clinical application system 120 using a voice technology”, paragraph [0033], “generate the final report”, paragraph [0135], “a first user (e.g., any network user) from any networked workstation in communication with the server can add dictation to any given field in any given note or form”).
Backes may not explicitly teach (underlined below for clarity):
a system of providing an artificial intelligence-based surgical result report using a voice recognition platform, the system comprising:
a learning unit configured to learn by artificial intelligence in conjunction with an artificial neural processing network using the respective diagnostic standard question data and response data corresponding thereto as learning data,
wherein the respective diagnostic standard question data comprising response selection numbers selected in response to questions; and
a control unit configured to convert the respective diagnostic standard question data into voice signals to transmit the converted voice signals to a user terminal,
receives the response selection numbers as user voice signals from the user terminal, and outputs response data corresponding to the response selection numbers through the artificial neural processing network.
Chiang teaches a system of providing an artificial intelligence-based surgical result report using a voice recognition platform (Chiang: Figure 1, paragraph [0020], “analysis are achieved by neural network or machine learning”, paragraphs [0040]-[0042], “the system receives the response information of the user's voice, image, or physiological measurement signal through the multimodal data input interface 204. For example, the voice data is received through the sensing module”, paragraph [0045], “the health status assessment module 108 outputs an evaluation report based on the data”), the system comprising:
a learning unit configured to learn by artificial intelligence in conjunction with an artificial neural processing network using the respective diagnostic standard question data and response data corresponding thereto as learning data (Chiang: Figure 1, paragraph [0005], “selecting a health status assessment question from a dialog design database and presenting the health status assessment question… judging whether the interactive end signal is received; when the interactive end signal has been received, the health status evaluation program outputs an evaluation report according to the data in the temporary storage space; when the interactive end signal is not received, select another health status assessment question in the follow-up item”, paragraph [0020], “analysis are achieved by neural network or machine learning”. The Examiner notes that a neural network is used to learn questions and responses for interactive health assessment);
wherein the respective diagnostic standard question data comprising response selection numbers selected in response to questions (Chiang: Figure 2, paragraph [0016], “each health status assessment question corresponds to a number and a question script”, paragraph [0043], “When the response information belongs to the numerical type, referring to step S22, the health status evaluation module 108 generates a first evaluation result according to the response information and the preset numerical rule”), 
a control unit configured to convert the respective diagnostic standard question data into voice signals to transmit the converted voice signals to a user terminal (Chiang: paragraph [0017], “The data output interface 202 electrically connects and communicates with the dialog design database 102, and the data output interface 202 is used to present selected health status assessment questions to the user by visual or audio ways. In practice, the data output interface 202, such as a screen or a speaker, displays text or images, animated characters, or plays a selected health status evaluation question by voice”, paragraph [0031], “The communication module 109 converts the selected health status assessment question and the evaluation report into a data format”),
receives the response selection numbers as user voice signals from the user terminal, and outputs response data corresponding to the response selection numbers through the artificial neural processing network (Chiang: Figure 2, paragraph [0005], “the health status evaluation program outputs an evaluation report according to the data”, paragraph [0016], “each health status assessment question corresponds to a number and a question script”, paragraphs [0019]-[0020], “the voice signal received by the microphone (multimodal data input interface 204) is converted into text… analysis are achieved by neural network or machine learning”, paragraphs [0040]-[0042], “the system receives the response information of the user's voice, image, or physiological measurement signal through the multimodal data input interface 204. For example, the voice data is received through the sensing module”).
One of ordinary skill in the art before the effective filing date would have found it obvious to include using a neural network to learn selection numbers for questions in a report as taught by Chiang within the surgical dictation-based reporting as taught by Backes with the motivation to “improve and optimize the medical quality” (Chiang: paragraph [0046]).
Backes and Chiang may not explicitly teach (underlined below for clarity):
a standard question generation unit configured to analyze a standard surgical result report including surgical procedures and, surgical and diagnostic results for a specific medical department to convert the surgical procedures and, surgical and diagnostic results into string data, and extracts keywords from the string data to generate respective diagnostic standard question data including medical information;
Casella dos Santos teaches a standard question generation unit configured to analyze a standard surgical result report including surgical procedures and, surgical and diagnostic results for a specific medical department to convert the surgical procedures and, surgical and diagnostic results into string data, and extracts keywords from the string data to generate respective diagnostic standard question data including medical information (Casella dos Santos: paragraphs [0019]-[0020], “setting forth the patient's diagnoses related to the surgical procedure… documenting the name(s) of the procedure(s) performed during the operation. In this case, two procedures were performed--an appendectomy, and lysis of adhesions. Section 106 then states the name(s) of one or more of the clinical personnel involved in performing the surgical procedure”, paragraph [0041], “a clinical language understanding ontology or other knowledge representation model utilized by fact extraction component 204, such that ASR engine 202 can produce a text transcription containing terms in a form understandable to fact extraction component 204”, paragraph [0047], “extracting clinical facts from the text transcription may be used”);
One of ordinary skill in the art before the effective filing date would have found it obvious to include using surgical reports with procedures and results as taught by Casella dos Santos within the surgical dictation-based reporting as taught by Backes and Chiang with the motivation to “improve the speech recognition process” (Casella dos Santos: paragraph [0042]).
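To make the response-selection limitation discussed for claim 1 concrete, the sketch below shows one way a transcribed voice answer could be mapped to a numbered response option. It is an illustration only, not the applicant's implementation or anything disclosed in the cited references: speech recognition and voice synthesis are omitted, a plain dictionary lookup stands in for the claimed neural network output step, and the sample question, option text, and function names are all hypothetical.

```python
import re

# Spoken number words the parser recognizes (assumed small vocabulary).
NUMBER_WORDS = {"one": 1, "two": 2, "three": 3, "four": 4, "five": 5}

# Hypothetical standard question with numbered response options.
question = {
    "prompt": "Which surgical approach was used?",
    "options": {1: "Laparoscopic", 2: "Open", 3: "Robotic-assisted"},
}


def parse_selection(transcript):
    """Return the first selection number found in a transcript, else None."""
    for token in re.findall(r"[a-z]+|\d+", transcript.lower()):
        if token.isdigit():
            return int(token)
        if token in NUMBER_WORDS:
            return NUMBER_WORDS[token]
    return None


def response_for(question, transcript):
    """Look up the response data for the spoken selection number."""
    number = parse_selection(transcript)
    return question["options"].get(number)


print(response_for(question, "Number two, please"))
```

A transcript with no recognizable number returns `None`, which a real system would presumably handle by repeating the question.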

Regarding (Currently Amended) claim 2, Backes, Chiang and Casella dos Santos teach the limitations of claim 1, and further teach wherein the control unit is further configured to generate a final surgical result report comprising surgical procedures and surgical and diagnostic results in consideration of the outputted response data (Backes: paragraphs [0029]-[0031], “The clinical application program 120 provides the user with several tools for generating a report… The core reporting system 140 processes data from the voice engine 160 received through its audio input 142 in order to generate a report through its report generator 144”, paragraph [0034], “A final form reports module 178 can store the report with the data in its final form”; Casella dos Santos: paragraphs [0019]-[0020], “setting forth the patient's diagnoses related to the surgical procedure… documenting the name(s) of the procedure(s) performed during the operation. In this case, two procedures were performed--an appendectomy, and lysis of adhesions. Section 106 then states the name(s) of one or more of the clinical personnel involved in performing the surgical procedure”).
The motivation to combine is the same as in claim 1, incorporated herein.

Regarding (Currently Amended) claim 4, Backes, Chiang and Casella dos Santos teach the limitations of claim 1, and further teach wherein the control unit is further configured to: performs syntactic analysis or semantic analysis on the respective diagnostic standard question data using a natural language understanding (NLU) module to identify a meaning of a text string in a natural language (Chiang: paragraph [0020], “Grammar and semantic analysis are achieved by neural network or machine learning… the semantic understanding program converts the text into a language framework of ideas, from which the user's intentions and the keywords answered by the user are taken out”; Casella dos Santos: paragraph [0041], “a clinical language understanding ontology or other knowledge representation model utilized by fact extraction component 204, such that ASR engine 202 can produce a text transcription containing terms in a form understandable to fact extraction component 204”, paragraph [0073], “a syntactic and/or grammatical parser… generate a natural language narration”); 
converts the text string in the natural language into the voice signals by a voice synthesis module, and transmits the converted voice signal to the user terminal (Chiang: paragraph [0017], “The data output interface 202 electrically connects and communicates with the dialog design database 102, and the data output interface 202 is used to present selected health status assessment questions to the user by visual or audio ways. In practice, the data output interface 202, such as a screen or a speaker, displays text or images, animated characters, or plays a selected health status evaluation question by voice”, paragraph [0031], “The communication module 109 converts the selected health status assessment question and the evaluation report into a data format”).
The motivation to combine is the same as in claim 1, incorporated herein.

Regarding (Currently Amended) claim 5, Backes, Chiang and Casella dos Santos teach the limitations of claim 1, and further teach wherein the control unit further comprises a receiving unit configured to receive the response selection numbers included in the respective diagnostic standard question data as the user voice signals from the user terminal; and an automatic speech recognition (ASR) module configured to convert the user voice signals received from the receiving unit into text data (Chiang: Figure 2, paragraph [0005], “the health status evaluation program outputs an evaluation report according to the data”, paragraph [0016], “each health status assessment question corresponds to a number and a question script”, paragraphs [0019]-[0020], “the identification module 104 is configured to execute an identification procedure… the voice signal received by the microphone (multimodal data input interface 204) is converted into text”, paragraphs [0040]-[0043], “the system receives the response information of the user's voice, image, or physiological measurement signal through the multimodal data input interface 204. For example, the voice data is received through the sensing module… When the response information belongs to the numerical type, referring to step S22, the health status evaluation module 108 generates a first evaluation result according to the response information and the preset numerical rule”).
The motivation to combine is the same as in claim 1, incorporated herein.

Regarding (Currently Amended) claim 6, Backes, Chiang and Casella dos Santos teach the limitations of claim 5, and further teach a classification unit configured to output the response selection numbers, which are the text data, as result values of the response data through a deep learning-based classifier model using the artificial neural processing network (Chiang: paragraph [0005], “the health status evaluation program outputs an evaluation report according to the data”, paragraph [0016], “each health status assessment question corresponds to a number and a question script”, paragraphs [0019]-[0020], “the voice signal received by the microphone (multimodal data input interface 204) is converted into text… analysis are achieved by neural network or machine learning”, paragraph [0026], “the system can make more comprehensive questions about the same intent category in order to gain a deeper understanding of the user's situation. In still another embodiment of the present disclosure, the control module 206 further calculates a user completion for each intent category”. The Examiner notes categories read on classes under the broadest reasonable interpretation).
The motivation to combine is the same as in claim 1, incorporated herein.

Regarding (Currently Amended) claim 7, Backes, Chiang and Casella dos Santos teach the limitations of claim 1, and further teach wherein the standard question generation unit generates the respective diagnostic standard question data by labeling the respective diagnostic standard question data with indices (Backes: paragraphs [0046]-[0048], “a synchronization and indexing file that synchronizes and indexes the sounds in the audio file to the text in the editable text file… The audio file and editable transcribed text file may be indexed by the indexing file, such that each transcribed word of dictation in the editable transcribed text file is referenced to a location, and thus a sound, in the associated audio file”; Chiang: paragraph [0016], “each health status assessment question corresponds to a number and a question script”).
The motivation to combine is the same as in claim 1, incorporated herein.

Regarding (Currently Amended) claim 8, Backes teaches a method of providing [… a …] surgical result report using a voice recognition platform (Backes: Fig. 1, paragraph [0023], “The voice interface OCX can be suitable for any area of medicine… related surgical or diagnostic specialties, and general medicine”, paragraph [0028], “While a user is working on generating a report through a clinical application system 120 using a voice technology, such as a voice recognition engine 160”), the method comprising: […],
[…] transmit the converted voice signals to a user terminal, and receiving the response selection numbers as user voice signals from the user terminal (Backes: paragraph [0025], “a computer system to obtain a medical dictation workflow system. The computer system can be a personal computer, workstation, or a handheld computing device… The input device may be any device for inputting commands to a computer, which can include one or more of any of the following: keyboard, keypad, infrared transmitter, microphone, voice detector, pointing device, light pen, mouse, touch screen, or stylus. The output device may comprise, for example, a display, such as a monitor, a speechmike, a speaker”); and
respectively generating response data corresponding to the received response selection numbers using diagnostic standard question data […], and generating a final surgical result report comprising […] the generated response data (Backes: Figure 1, paragraph [0028], “The voice interface OCX can incorporate voice technologies into the system, such that the various application programs can process voice data from the voice technologies… While a user is working on generating a report through a clinical application system 120 using a voice technology”, paragraph [0033], “generate the final report”, paragraph [0135], “a first user (e.g., any network user) from any networked workstation in communication with the server can add dictation to any given field in any given note or form”).
Backes may not explicitly teach (underlined below for clarity):
a method of providing an artificial intelligence-based surgical result report using a voice recognition platform, 
generating a plurality of diagnostic standard question data based on learning by artificial intelligence of a standard surgical result report, 
wherein the plurality of diagnostic standard question data comprises response selection numbers selected in response to questions;
converting the plurality of diagnostic standard question data into voice signals to transmit the converted voice signals to a user terminal, and receiving the response selection numbers as user voice signals from the user terminal;
respectively generating response data corresponding to the received response selection numbers using diagnostic standard question data learned by the artificial intelligence, and generating a final surgical result report comprising […] the generated response data.
Chiang teaches a method of providing an artificial intelligence-based surgical result report using a voice recognition platform (Chiang: Figure 1, paragraph [0020], “analysis are achieved by neural network or machine learning”, paragraphs [0040]-[0042], “the system receives the response information of the user's voice, image, or physiological measurement signal through the multimodal data input interface 204. For example, the voice data is received through the sensing module”, paragraph [0045], “the health status assessment module 108 outputs an evaluation report based on the data”), 
generating a plurality of diagnostic standard question data based on learning by artificial intelligence of a standard surgical result report (Chiang: Figure 1, paragraph [0005], “selecting a health status assessment question from a dialog design database and presenting the health status assessment question… judging whether the interactive end signal is received; when the interactive end signal has been received, the health status evaluation program outputs an evaluation report according to the data in the temporary storage space; when the interactive end signal is not received, select another health status assessment question in the follow-up item”, paragraph [0020], “analysis are achieved by neural network or machine learning”. The Examiner notes a neural network is used to learn questions and responses for interactive health assessment);
wherein the plurality of diagnostic standard question data comprises response selection numbers selected in response to questions (Chiang: Figure 2, paragraph [0016], “each health status assessment question corresponds to a number and a question script”, paragraph [0043], “When the response information belongs to the numerical type, referring to step S22, the health status evaluation module 108 generates a first evaluation result according to the response information and the preset numerical rule”), 
converting the plurality of diagnostic standard question data into voice signals to transmit the converted voice signals to a user terminal, and receiving the response selection numbers as user voice signals from the user terminal (Chiang: paragraph [0017], “The data output interface 202 electrically connects and communicates with the dialog design database 102, and the data output interface 202 is used to present selected health status assessment questions to the user by visual or audio ways. In practice, the data output interface 202, such as a screen or a speaker, displays text or images, animated characters, or plays a selected health status evaluation question by voice”, paragraph [0031], “The communication module 109 converts the selected health status assessment question and the evaluation report into a data format”);
respectively generating response data corresponding to the received response selection numbers using diagnostic standard question data learned by the artificial intelligence, and generating a final surgical result report comprising […] the generated response data (Chiang: Figure 2, paragraph [0005], “the health status evaluation program outputs an evaluation report according to the data”, paragraph [0016], “each health status assessment question corresponds to a number and a question script”, paragraphs [0019]-[0020], “the voice signal received by the microphone (multimodal data input interface 204) is converted into text… analysis are achieved by neural network or machine learning”, paragraphs [0040]-[0042], “the system receives the response information of the user's voice, image, or physiological measurement signal through the multimodal data input interface 204. For example, the voice data is received through the sensing module”).
One of ordinary skill in the art before the effective filing date would have found it obvious to include using a neural network to learn selection numbers for questions in a report as taught by Chiang within the surgical dictation-based reporting as taught by Backes with the motivation of “improve and optimize the medical quality.” (Chiang: paragraph [0046]).
Backes and Chiang may not explicitly teach (underlined below for clarity):
generating a final surgical result report comprising surgical procedures and surgical and diagnostic results based on the generated response data.
Casella dos Santos teaches generating a final surgical result report comprising surgical procedures and surgical and diagnostic results based on the generated response data (Casella dos Santos: paragraphs [0019]-[0020], “setting forth the patient's diagnoses related to the surgical procedure… documenting the name(s) of the procedure(s) performed during the operation. In this case, two procedures were performed--an appendectomy, and lysis of adhesions. Section 106 then states the name(s) of one or more of the clinical personnel involved in performing the surgical procedure”, paragraph [0041], “a clinical language understanding ontology or other knowledge representation model utilized by fact extraction component 204, such that ASR engine 202 can produce a text transcription containing terms in a form understandable to fact extraction component 204”, paragraph [0047], “extracting clinical facts from the text transcription may be used”).
One of ordinary skill in the art before the effective filing date would have found it obvious to include using surgical reports with procedures and results as taught by Casella dos Santos within the surgical dictation-based reporting as taught by Backes and Chiang with the motivation of “improve the speech recognition process” (Casella dos Santos: paragraph [0042]).

REGARDING CLAIM(S) 9
Claim(s) 9 is/are analogous to Claim(s) 7, thus Claim(s) 9 is/are similarly analyzed and rejected in a manner consistent with the rejection of Claim(s) 7.

Claim(s) 3 and 10 is/are rejected under 35 U.S.C. 103 as being unpatentable over U.S. Patent App. No. 2010/0169092 (hereafter “Backes”), U.S. Patent App. No. 2020/0176116 (hereafter “Chiang”), and U.S. Patent App. No. 2013/0238329 (hereafter “Casella dos Santos”) as applied to claims 1 and 8 above, and further in view of U.S. Patent App. No. 2019/0279647 (hereafter “Jones”).

Regarding (Currently Amended) claim 3, Backes, Chiang and Casella dos Santos teach the limitations of claim 1, but may not explicitly teach wherein the learning unit is further configured to: allow feature values of the respective diagnostic standard question data to become an input vector using the artificial neural processing network, and learns, when passing through an input layer, a hidden layer, and an output layer, through supervised-learning to generate response data included in the respective diagnostic standard question data as an output vector.
Jones teaches wherein the learning unit allows feature values of the respective diagnostic standard question data to become an input vector using the artificial neural processing network, and learns, when passing through an input layer, a hidden layer, and an output layer, through supervised-learning to generate response data included in the respective diagnostic standard question data as an output vector (Jones: paragraph [0178], “the user preference vector may also be based on clinical information (e.g., electronic user medical health records, user-reported symptoms, medical providers' notes”, paragraph [0184], “The mapping may be performed utilizing one or more techniques. For example, a phrase/word co-occurrence matrix, a neural network (such as a skip-gram neural network comprising an input layer, an output layer, and one or more hidden layers)”).
One of ordinary skill in the art before the effective filing date would have found it obvious to include using a vector and layers as taught by Jones with the learning for reporting as taught by Backes, Chiang and Casella dos Santos with the motivation of “improving interactive voice-based and/or text-based sessions so that they are more natural, interpret user queries more accurately, and generate query responses with greater accuracy” (Jones: paragraph [0019]).

REGARDING CLAIM(S) 10
Claim(s) 10 is/are analogous to Claim(s) 3, thus Claim(s) 10 is/are similarly analyzed and rejected in a manner consistent with the rejection of Claim(s) 3.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
U.S. Patent Pub. No. 20220028382 (hereafter “Ganmukhi”) teaches using voice assisted machine learning to fill in fields of a medical report.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Andrew E Lee whose telephone number is (571)272-8323. The examiner can normally be reached M-Th 9-5:00 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Shahid Merchant can be reached on 571-270-1360. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/A.E.L./Examiner, Art Unit 3684

/Shahid Merchant/Supervisory Patent Examiner, Art Unit 3684
Prosecution Timeline

Oct 29, 2024
Application Filed
Dec 22, 2025
Non-Final Rejection mailed — §101, §103
Mar 23, 2026
Response Filed
Precedent Cases

Applications granted by this same examiner with similar technology

17/457,628
Patent 12542210
WEARABLE DEVICE AND COMPUTER ENABLED FEEDBACK FOR USER TASK ASSISTANCE
4y 2m to grant Granted Feb 03, 2026
15/621,324
Patent 12154077
USER INTERFACE FOR DISPLAYING PATIENT HISTORICAL DATA
7y 5m to grant Granted Nov 26, 2024
16/093,894
Patent 12040070
RADIOTHERAPY SYSTEM, DATA PROCESSING METHOD AND STORAGE MEDIUM
5y 9m to grant Granted Jul 16, 2024
15/436,180
Patent 12027251
SYSTEMS AND METHODS FOR MANAGING LARGE MEDICAL IMAGE DATA
7y 4m to grant Granted Jul 02, 2024
16/249,110
Patent 11942189
Drug Efficacy Prediction for Treatment of Genetic Disease
5y 2m to grant Granted Mar 26, 2024
Study what changed to get past this examiner. Based on 5 most recent grants.
Prosecution Projections

1-2
Expected OA Rounds
17%
Grant Probability
50%
With Interview (+33.0%)
3y 9m (~2y 2m remaining)
Median Time to Grant
Low
PTA Risk
Based on 132 resolved cases by this examiner. Grant probability derived from career allowance rate.