DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
Claims 2, 3, 4, 5, and 13 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Regarding claim 2, claim 2 recites “wherein the first detection device is provided in a vicinity of the image diagnostic apparatus or the image diagnostic apparatus”. The recitation of “the image diagnostic apparatus” twice appears to be a typographical error, as it need be included only once. For examination purposes, this portion of claim 2 was interpreted as “wherein the first detection device is provided in a vicinity of the image diagnostic apparatus”.
Claim 2 also recites that the first detection device is “provided in a vicinity of the image diagnostic apparatus”. The phrase “in a vicinity” is indefinite because the specification does not define how much distance from the image diagnostic apparatus constitutes “a vicinity”. Applicant must specify what distance is intended by “in a vicinity”.
Claims 3 and 13 are dependent on claim 2 and thus are rejected for the same reasons as stated above.
Claim 4 is rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Claim 4 recites “the second detection device is provided at a position farther away from the image diagnostic apparatus than the first detection device”. The phrase “farther away” is indefinite because the claim does not specify how much farther away the second detection device must be.
Claim 5 is dependent on claim 4 and thus is rejected for the same reasons as stated above.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 1-5, 8-9, and 16-20 are rejected under 35 U.S.C. 103 as being unpatentable over Zuccolotto (US 20050283068 A1) in view of Su et al. (hereinafter Su) (LipLearner: Customizing Silent Speech Commands from Voice Input using One-shot Lipreading).
Regarding claim 1, Zuccolotto teaches:
A medical image diagnostic system comprising (Zuccolotto, Abstract);
an image diagnostic apparatus that acquires a medical image (Zuccolotto, Abstract);
a first detection device (Zuccolotto, P[0013], shows a two-device system (microphones));
a second detection device (Zuccolotto, P[0013], shows a two-device system (microphones));
a first display device that displays information in an aspect visible to the subject during the examination using the image diagnostic apparatus (Zuccolotto, Abstract)
a processor (Zuccolotto, P[0058], P[0061]); and
a memory that stores a program to be executed by the processor (Zuccolotto, P[0058], P[0061]),
Zuccolotto does not teach:
a first detection device that detects, with at least one of a video or a voice, utterance-related information related to an utterance of a subject during an examination of the subject using the image diagnostic apparatus
a second detection device that detects, with at least one of a video or a voice, response information of the subject to a question to the subject before the examination of the subject using the image diagnostic apparatus is started
wherein the processor
generates subject feature information related to voice generation of the subject based on the response information detected by the second detection device
recognizes an utterance content of the subject based on the utterance-related information detected by the first detection device and the subject feature information.
However, Su teaches:
device that detects, with at least one of a video or a voice, utterance-related information related to an utterance of a subject during an examination of the subject using the image diagnostic apparatus (Su, Page 4: "we conducted a pilot experiment with 3 participants (2 male, 1 female), which took around 10 minutes. In the registration session, the participant was asked to read aloud the 10 sentences from the OuluVS corpus [12] as commands in random order. In the test session, the participant issued each command 5 times in a silent manner. Those commands are recognized in real time and the results are displayed at the top of screen." (the phone, with its camera and microphone, reads on a first detection device that detects, with at least one of a video or a voice, utterance-related information; the test session, in which the participant issues commands silently so that only the camera is relied upon, simulates the noisy environment of an MRI machine))
device that detects, with at least one of a video or a voice, response information of the subject to a question to the subject before the examination of the subject using the image diagnostic apparatus is started (Su, Page 4 (same quote as above): the registration session, in which the second detection device (phone) detects with both video (phone camera) and voice (phone microphone) the participant's responses to prompted sentences (the prompts read on questions to the subject); Zuccolotto, as modified by Su, makes obvious performing this registration before the examination of the subject using the MRI (image diagnostic apparatus) is started)
wherein the processor
generates subject feature information related to voice generation of the subject based on the response information detected by the second detection device (Su, Page 4: the registration/test workflow inherently involves gathering user-specific voice information from the camera and microphone in order to be able to read silently spoken speech in the test session) and
recognizes an utterance content of the subject based on the utterance-related information detected by the first detection device and the subject feature information to cause the first display device to display the recognized utterance content (Su, Figure 1, parts B and C: the recognized utterance content, derived from the detected utterance-related information (user lip movement), is displayed, and the user is asked to confirm whether it is correct).
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Zuccolotto in view of Su. Doing so would have combined the silent-speech (lip-based) utterance capture and subject-specific training framework of Su (Su, Abstract, Page 4) with the speech recognition system used during a noisy MRI exam of Zuccolotto (Zuccolotto, Abstract), thus leading to an improved personalized, multi-modal speech recognition system that can leverage pre-collected subject feature information to improve accuracy and robustness during a medical diagnostic scan.
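As an illustrative aid only (no code appears in the cited references or the claims), the two-phase flow that the examiner reads onto the Zuccolotto/Su combination can be sketched in Python as follows; every name, data structure, and value below is hypothetical:

```python
"""Minimal sketch of the examiner's reading of the Zuccolotto/Su combination.
All names and data structures are hypothetical and appear in neither reference."""
from dataclasses import dataclass

@dataclass
class Sample:
    label: str        # prompted sentence (reads on a "question" to the subject)
    lip_video: list   # lip-movement frames from the camera
    audio: list       # voiced reading captured by the microphone

def register_subject(prompted):
    # Phase 1 (before the exam; maps to the second detection device): detect
    # response information with both video and voice, then build the subject
    # feature information (here, a toy label-to-lip-features table).
    samples = [Sample(lbl, vid, aud) for lbl, vid, aud in prompted]
    return {s.label: s.lip_video for s in samples}

def similarity(a, b):
    # Toy overlap score standing in for a real lip-reading model.
    return sum(x == y for x, y in zip(a, b))

def recognize_during_exam(lip_video, subject_features):
    # Phase 2 (during the noisy MRI exam of Zuccolotto; maps to the first
    # detection device): recognize the silent utterance from video alone by
    # nearest match over the enrolled lip features, for display to the subject.
    return max(subject_features, key=lambda lbl: similarity(subject_features[lbl], lip_video))

# Usage: enroll two commands before the scan, then "recognize" a silent utterance.
features = register_subject([("stop scan", [1, 2, 3], [0.1]), ("all good", [4, 5, 6], [0.2])])
print(recognize_during_exam([1, 2, 9], features))  # -> stop scan
```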
Regarding claim 2, the combination of Zuccolotto and Su teaches the medical image diagnostic system according to claim 1.
Zuccolotto, in combination with Su, further teaches:
wherein the first detection device is provided in a vicinity of the image diagnostic apparatus (Zuccolotto, Abstract: reads on the vicinity of the MRI machine) and includes at least one of a first camera that captures a video including at least a lip part of a face region of the subject during the examination using the image diagnostic apparatus (Zuccolotto, P[0018]: camera during the MRI exam that covers movement in the head area) or a first microphone that detects a voice uttered by the subject (Zuccolotto, P[0008]: microphone during the MRI exam that captures the patient's speech for speech recognition).
Regarding claim 3, the combination of Zuccolotto and Su teaches the medical image diagnostic system according to claim 2.
Su, in combination with Zuccolotto, further teaches:
wherein the utterance-related information is a lip movement of the subject acquired from the video captured by the first camera (Su, Page 2, Section 2.4: reads on the phone camera capturing the lip movement).
Regarding claim 4, the combination of Zuccolotto and Su teaches the medical image diagnostic system according to claim 1.
Zuccolotto, in combination with Su, further teaches:
wherein the second detection device is provided at a position farther away from the image diagnostic apparatus than the first detection device (Zuccolotto, P[0021]: “In the present invention noise cancellation microphone system there is no problem with differential microphone placement” (one microphone device in the system can be placed differently, including farther away from the MRI machine, than another microphone device)) and includes at least one of a second camera that captures a video including a face region of the subject, or a second microphone that detects a voice uttered by the subject (Zuccolotto, P[0018]: camera during the MRI exam that covers movement in the head area; P[0008]: microphone during the MRI exam that captures the patient's speech for speech recognition).
Regarding claim 5, the combination of Zuccolotto and Su teaches the medical image diagnostic system according to claim 4.
Su, in combination with Zuccolotto, further teaches:
wherein the response information is a lip movement of the subject acquired from the video captured by the second camera (Su, Page 2, Section 2.4: reads on the phone camera capturing the lip movement).
Regarding claim 8, the combination of Zuccolotto and Su teaches the medical image diagnostic system according to claim 1.
Su, in combination with Zuccolotto, further teaches:
wherein the processor
trains a first machine learning model dedicated to the subject (Su, Page 3: “For each user, we train an SVM classifier on their own data to maximize the accuracy (the training time is approximately 0.1 second)”) based on the response information detected by the second detection device (Su, Page 3: “During the utterance, the user keeps holding down the record button to record the lip movements using the front camera and the voice using the microphone.” (this lip movement is the response information detected by the second detection device)), and
inputs the utterance-related information detected by the first detection device to the trained first machine learning model, to acquire the utterance content recognized by the first machine learning model (Su, Page 4: “In the test session, the participant issued each command 5 times in a silent manner.” (the test session input is fed into the trained machine learning model for recognition; the device used in the test session serves the function of the first detection device); Su, Page 3: “allowing instant fine-tuning for unseen users and words using one-shot learning.” (the trained model is used for classification); Su, Page 4: “With the text as the label, the model can recognize silent speech input into commands” (the trained model outputs recognized command content, which corresponds to acquiring the recognized utterance content)).
Regarding claim 9, the combination of Zuccolotto and Su teaches the medical image diagnostic system according to claim 8.
Su, in combination with Zuccolotto, further teaches:
wherein the subject feature information includes parameters optimized (Su, Page 3: “The model takes an input of 29 frames center-cropped to 88 × 88 pixels and outputs a 500-dimensional vector.” (the 500-dimensional vector is feature information extracted from the subject’s lip movements); Su, Page 3: “instant fine-tuning for unseen users and words using one-shot learning.” (reads on optimizing model parameters)) in a process of training the first machine learning model based on response information of the subject (Su, Page 3: “During the utterance, the user keeps holding down the record button to record the lip movements using the front camera and the voice using the microphone.” (the parameters are optimized based on data captured from the subject (lip movement and voice))).
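As an illustrative aid only, the per-subject one-shot training characterized in the claim 8 and claim 9 mappings can be sketched as follows; the sketch assumes scikit-learn, replaces Su's pretrained lip encoder (29 frames center-cropped to 88 × 88 pixels, 500-dimensional output) with placeholder vectors, and uses only hypothetical names:

```python
"""Minimal sketch of per-subject one-shot training; placeholder data only."""
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Placeholder "subject feature information": one 500-dimensional embedding per
# enrolled command, as if produced by the pretrained lip encoder during the
# registration session (response information from the second detection device).
commands = ["stop scan", "all good", "need help"]
X_train = rng.normal(size=(len(commands), 500))

# Claim 8 mapping: train a first machine learning model dedicated to the subject.
clf = SVC(kernel="linear").fit(X_train, commands)

# Claim 9 mapping: the fitted coefficients are the parameters optimized in the
# process of training on the subject's own response information.
test_embedding = X_train[0] + rng.normal(scale=0.01, size=500)  # near "stop scan"
print(clf.predict(test_embedding.reshape(1, -1)))  # -> ['stop scan']
```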
Regarding claim 16, the combination of Zuccolotto and Su teaches the medical image diagnostic system according to claim 1.
Zuccolotto, in combination with Su, further teaches:
wherein the first display device is a projector that performs projection onto a screen visible to the subject, a head-up display, a head-mounted display, a liquid crystal display, or an organic EL display (Zuccolotto, P[0011], projected video).
Regarding claim 17, the combination of Zuccolotto and Su teaches the medical image diagnostic system according to claim 1.
Zuccolotto, in combination with Su, further teaches:
wherein the image diagnostic apparatus includes a magnetic resonance imaging apparatus, an X-ray CT apparatus, a PET apparatus, a radiation therapy apparatus, or a particle beam therapy apparatus (Zuccolotto, Abstract, reads on MRI apparatus).
Regarding claim 18, claim 18 recites the operation method corresponding to the medical diagnostic system presented in claim 1 and is rejected under the same grounds as above.
Regarding claim 19, claim 19 recites the information processing system corresponding to the medical diagnostic system presented in claim 8 and is rejected under the same grounds as above.
Regarding claim 20, claim 20 recites the information processing system corresponding to the medical diagnostic system presented in claim 1 and is rejected under the same grounds as above.
Claims 6 and 7 are rejected under 35 U.S.C. 103 as being unpatentable over Zuccolotto (US 20050283068 A1) in view of Su (LipLearner: Customizing Silent Speech Commands from Voice Input using One-shot Lipreading) and in further view of Shah et al. (hereinafter Shah) (US 8160683 B2).
Regarding claim 6, the combination of Zuccolotto and Su teaches the medical image diagnostic system according to claim 1.
The combination of Zuccolotto and Su does not teach:
further comprising:
a question apparatus that asks a question to the subject,
wherein the question apparatus asks a question of a predetermined format with at least one of a voice or characters on a monitor screen
However, Shah teaches:
further comprising:
a question apparatus that asks a question to the subject,
wherein the question apparatus asks a question of a predetermined format with at least one of a voice or characters on a monitor screen (Shah, P[36]: “may include prompting a user (e.g., the caregiver) to confirm the new patient information was correctly determined”; “user may confirm the command by speaking the word "yes" or the word "execute" in response to the displayed command.” (reads on a predetermined-format question regarding patient information presented via a monitor screen and answered by voice)).
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Zuccolotto in view of Su and Shah. Doing so would have combined the display prompting of Shah (Shah, Abstract, P[36]) with the silent-speech (lip-based) utterance capture and subject-specific training framework of Su (Su, Abstract, Page 4) and the speech recognition system during a noisy MRI exam of Zuccolotto (Zuccolotto, Abstract), thus leading to an improved personalized, multi-modal speech recognition system that can leverage pre-collected subject feature information with real-time user feedback to improve the accuracy and robustness of speech recognition and patient comfort during a medical diagnostic scan.
Regarding claim 7, the combination of Zuccolotto, Su, and Shah teaches the medical image diagnostic system according to claim 6.
Shah, in combination with Zuccolotto and Su, further teaches:
wherein the question of the predetermined format includes a question content for inducing an utterance having a possibility of being uttered by the subject during the examination, as an answer, or causing the subject to read aloud the utterance (Shah, P[36]: “may include prompting a user (e.g., the caregiver) to confirm the new patient information was correctly determined”; “user may confirm the command by speaking the word "yes" or the word "execute" in response to the displayed command.” (the new patient information can include health information that may be referred to by the patient while in the examination)).
Claims 10-12 are rejected under 35 U.S.C. 103 as being unpatentable over Zuccolotto (US 20050283068 A1) in view of Su (LipLearner: Customizing Silent Speech Commands from Voice Input using One-shot Lipreading) and in further view of Takanayagi et al. (hereinafter Takanayagi) (US 20170309275 A1).
Regarding claim 10, the combination of Zuccolotto and Su teaches the medical image diagnostic system according to claim 1.
The combination of Zuccolotto and Su does not teach:
wherein a second machine learning model that has been trained through machine learning in advance based on a training data set consisting of utterance-related information related to utterances of a plurality of people is provided and
the processor inputs the utterance-related information detected by the second detection device to the second machine learning model, to acquire the utterance content recognized by the second machine learning model in a case in which the utterance content recognized by the first machine learning model is not a meaningful content or a certainty degree of the utterance content is less than a threshold value
However, Takanayagi teaches:
wherein a second machine learning model that has been trained through machine learning in advance based on a training data set consisting of utterance-related information related to utterances of a plurality of people is provided (Takanayagi, P[0004]: “However, in subject independent (SI) lip reading services, errors can occur due to large variations within lip shapes, skin textures around the mouth, varying speaking speeds and different accents, which could significantly affect the spatiotemporal appearances of a speaking mouth. A recent SI lip reading algorithm developed by Zhou et al. can reportedly achieve recognition rates as high as 92.8%” (subject-independent models are trained on a plurality of people so as to handle large variations in lip shapes and skin textures across a population)), and
the processor inputs the utterance-related information detected by the second detection device to the second machine learning model, to acquire the utterance content recognized by the second machine learning model in a case in which the utterance content recognized by the first machine learning model is not a meaningful content or a certainty degree of the utterance content is less than a threshold value (Takanayagi, P[0109]-P[0110]: “if no prototype signal has a probability greater than a predetermined standard such as, for example, 90%, the portion of the video stream corresponding to this portion of the audio stream (the previous Y time units) is input to the video based recognition module”, “the decision to proceed to 524 could be decided based upon a combination of if the word is a characteristic word and the probability of the prototype signal” (hierarchical logic is described here in which the system checks the “probability” (certainty degree) of a result against a “predetermined standard” (threshold) and, if the result is not suitable, proceeds to another recognition module/model; using a general (SI) model as a fallback for a low-probability subject-dependent (SD) result is a standard application of this logic)).
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Zuccolotto in view of Su and Takanayagi. Doing so would have combined the two-machine-learning-model recognition of Takanayagi (Takanayagi, Abstract, P[0109]-[0110]) with the silent-speech (lip-based) utterance capture and subject-specific training framework of Su (Su, Abstract, Page 4) and the speech recognition system during a noisy MRI exam of Zuccolotto (Zuccolotto, Abstract), thus leading to an improved personalized, dual-modality (video and audio), two-model speech recognition system that can leverage pre-collected subject feature information to improve the accuracy and robustness of speech recognition and patient comfort during a medical diagnostic scan.
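As an illustrative aid only, the fallback logic attributed to Takanayagi above (check the certainty degree of the first, subject-dependent model against a threshold and, if the result is unsuitable, route the input to the second, subject-independent model) can be sketched as follows; the threshold and all names are hypothetical:

```python
"""Minimal sketch of threshold-based fallback between two recognition models."""

MEANINGFUL = {"stop scan", "all good", "need help"}  # hypothetical vocabulary

def recognize_with_fallback(utterance_info, first_model, second_model, threshold=0.9):
    # First model: subject-dependent, trained on the subject's own data.
    content, certainty = first_model(utterance_info)
    if content not in MEANINGFUL or certainty < threshold:
        # Second model: subject-independent, trained on a plurality of people.
        content, certainty = second_model(utterance_info)
    return content

# Toy models returning (content, certainty) pairs.
first = lambda x: ("uhh", 0.42)         # not meaningful and below threshold
second = lambda x: ("stop scan", 0.95)  # general fallback recognizer
print(recognize_with_fallback(None, first, second))  # -> stop scan
```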
Regarding claim 11, the combination of Zuccolotto and Su teaches the medical image diagnostic system according to claim 1.
Zuccolotto, in combination with Su, further teaches: an operation room (Zuccolotto, Fig. 1, P[0009]: the MRI environment reads on an operation room).
The combination of Zuccolotto and Su does not teach:
further comprising:
a notification device that notifies an operator in an operation room of the image diagnostic apparatus of the utterance content, wherein the processor outputs the utterance content to the notification device
However, Takanayagi teaches:
further comprising:
a notification device that notifies an operator in an operation room of the image diagnostic apparatus of the utterance content, wherein the processor outputs the utterance content to the notification device (Takanayagi, P[0012]: “sending the first data packets to the server end device (a remote apparatus)”; P[0068]: “server 206 is connected to or includes one or more modules 208, 210 for performing audio and video based speech recognition.” (the distributed architecture here involves a local “dictation device” (subject-side) sending data to a “remote apparatus” (operator-side); in a medical diagnostic suite, the “remote apparatus” is the technician’s workstation in the operation room, which receives and outputs the processed utterance content)).
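As an illustrative aid only, the subject-side-to-operator-side notification path read onto Takanayagi's distributed architecture can be sketched as follows; the in-process queue stands in for the link between the dictation device and the remote apparatus, and all names are hypothetical:

```python
"""Minimal sketch of a subject-side device notifying an operator workstation."""
import queue

link = queue.Queue()  # stands in for the device-to-workstation network link

def subject_side(utterance_content):
    # The processor outputs the recognized utterance content to the link.
    link.put(utterance_content)

def operator_workstation():
    # The notification device in the operation room receives and outputs it.
    content = link.get()
    print(f"[operator console] patient said: {content}")

subject_side("need help")
operator_workstation()  # -> [operator console] patient said: need help
```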
Regarding claim 12, the combination of Zuccolotto, Su, and Takanayagi teaches the medical image diagnostic system according to claim 11.
Takanayagi, in combination with Zuccolotto and Su, further teaches:
wherein the notification device is at least one of a second display device that displays characters indicating the utterance content or a speaker that generates a voice indicating the utterance content (Takanayagi, P[0012]: “The controller can be further be configured to render the combined dictation as text on a display.”; P[0096]: “render the combined dictation as text on the display 320 and/or send the second data packets as input to a downstream application such as an Internet website or a control command to other devices such as a television” (text on a “display” (characters) is explicitly described alongside outputting data to “other devices” (such as a television/speaker system); in a medical setting, using these standard output methods (monitors/speakers) to inform an operator is a routine implementation)).
Claim 13 is rejected under 35 U.S.C. 103 as being unpatentable over Zuccolotto (US 20050283068 A1) in view of Su (LipLearner: Customizing Silent Speech Commands from Voice Input using One-shot Lipreading) and in further view of Miller et al. (hereinafter Miller) (US 20060074286 A1).
Regarding claim 13, the combination of Zuccolotto and Su teaches the medical image diagnostic system according to claim 3.
The combination of Zuccolotto and Su does not teach:
wherein the processor
determines whether or not the lip movement of the subject hinders the examination of the subject using the image diagnostic apparatus in a case in which next utterance-related information is not detected for a certain time or longer after the first detection device detects the utterance-related information, and causes the first display device to display characters prompting the subject to make an utterance in a case in which it is determined that the lip movement of the subject does not hinder the examination of the subject using the image diagnostic apparatus
However, Miller teaches:
wherein the processor
determines whether or not the lip movement of the subject hinders the examination of the subject using the image diagnostic apparatus in a case in which next utterance-related information is not detected for a certain time or longer after the first detection device detects the utterance-related information, and causes the first display device to display characters prompting the subject to make an utterance in a case in which it is determined that the lip movement of the subject does not hinder the examination of the subject using the image diagnostic apparatus (Miller, Abstract: "prompt the patient, using the patient display, to perform a bodily action that facilitates the scan"; P[0028]: "In accordance with various embodiments of the invention, the promptings are based on the actions of the patient. The patient's actions are monitored. Based on these monitored actions" (monitoring patient actions (which include lip movement/speech) and determining whether those actions hinder the scan via artifacts is explicitly taught; “timeout” logic is also used, in which the system triggers the next step if a specific action (or lack thereof) is detected for a certain time; the specific hardware (patient display) and the logic of timing the prompt so that it occurs only when it “facilitates the scan” (i.e., does not hinder it) are taught; the breath holding mentioned in this reference can obviously be applied to, and reads on, making an utterance)).
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Zuccolotto in view of Su and Miller. Doing so would have combined the logic-based system with timed prompting of Miller (Miller, Abstract) with the silent-speech (lip-based) utterance capture and subject-specific training framework of Su (Su, Abstract, Page 4) and the speech recognition system during a noisy MRI exam of Zuccolotto (Zuccolotto, Abstract), thus leading to an improved personalized, dual-modality (video and audio) speech recognition system that can leverage pre-collected subject feature information with in-exam utterance monitoring to improve the accuracy and robustness of speech recognition and patient comfort and safety during a medical diagnostic scan.
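As an illustrative aid only, the timeout-and-prompt logic attributed to Miller, restated against the claim 13 limitation, can be sketched as follows; the 10-second window and all names are hypothetical:

```python
"""Minimal sketch of timeout-gated prompting on the first display device."""
import time

SILENCE_WINDOW = 10.0  # hypothetical "certain time" after the last utterance

def maybe_prompt(last_utterance_time, lip_movement_hinders_exam, display, now=None):
    now = time.monotonic() if now is None else now
    if now - last_utterance_time < SILENCE_WINDOW:
        return False  # next utterance-related information arrived in time
    if lip_movement_hinders_exam:
        return False  # speaking would corrupt the scan; do not prompt
    display("Please speak if you need anything.")  # characters prompting an utterance
    return True

# Usage: 12 s of silence during a phase where lip movement does not hinder the scan.
print(maybe_prompt(last_utterance_time=0.0, lip_movement_hinders_exam=False,
                   display=print, now=12.0))  # prompts, then -> True
```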
Claims 14 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Zuccolotto (US 20050283068 A1) in view of Su (LipLearner: Customizing Silent Speech Commands from Voice Input using One-shot Lipreading) and in further view of Sayadi et al. (hereinafter Sayadi) (US 11931168 B2).
Regarding claim 14, the combination of Zuccolotto and Su teaches the medical image diagnostic system according to claim 1.
Zuccolotto, in combination with Su, further teaches:
first display device to display the reply sentence (Zuccolotto, P[0039]: “The video, or scene or image viewed by the patient, may include material such as instructions”).
during the examination of the subject using the image diagnostic apparatus (Zuccolotto, Abstract, P[0008], MRI reads on image diagnostic apparatus, “patient during scanning” reads on during examination of subject)
The combination of Zuccolotto and Su does not teach:
further comprising:
a vital information measurement device that measures vital information of the subject
wherein the processor determines whether or not a reply to the utterance content is necessary, based on the utterance content and the measured vital information, and creates a reply sentence corresponding to the utterance content to cause the first display device to display the reply sentence in a case in which it is determined that the reply is necessary
However, Sayadi teaches:
further comprising:
a vital information measurement device that measures vital information of the subject during the examination of the subject using the image diagnostic apparatus (Sayadi, P[13], describes all vital information that is recorded),
wherein the processor determines whether or not a reply to the utterance content is necessary, based on the utterance content and the measured vital information, and creates a reply sentence corresponding to the utterance content to cause the first display device to display the reply sentence in a case in which it is determined that the reply is necessary (Sayadi, P[12]: “and a processor configured to record the subject's audio and biosignals, process them, detect the subject's speech, process the subject's speech, and initiate a response to the subject's speech.” (describes initiating a response based on the subject's speech and measured vital information, which reads on creating a reply for the subject)).
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Zuccolotto in view of Su and Sayadi. Doing so would have combined the vitals-based diagnostic response system of Sayadi (Sayadi, Abstract, P[12]-[13]) with the silent-speech (lip-based) utterance capture and subject-specific training framework of Su (Su, Abstract, Page 4) and the speech recognition system during a noisy MRI exam of Zuccolotto (Zuccolotto, Abstract), thus leading to an improved personalized, dual-modality (video and audio) speech recognition system that can leverage pre-collected subject feature information with in-exam utterance monitoring to improve the accuracy and robustness of speech recognition and patient comfort and safety during a medical diagnostic scan.
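As an illustrative aid only, the reply-decision step read onto Sayadi (determine from the utterance content and the measured vital information whether a reply is necessary, and if so compose a reply sentence for the first display device) can be sketched as follows; the thresholds and all names are hypothetical:

```python
"""Minimal sketch of vitals-aware reply generation; hypothetical thresholds."""

def needs_reply(utterance, vitals):
    # A reply is deemed necessary on a distress keyword or an elevated heart rate.
    return "help" in utterance or vitals["heart_rate"] > 110

def reply_for(utterance, vitals):
    if not needs_reply(utterance, vitals):
        return None
    return "The technologist has been notified. Please stay still."

vitals = {"heart_rate": 118, "spo2": 97}  # measured vital information
sentence = reply_for("I feel dizzy, help", vitals)
if sentence:
    print(sentence)  # shown on the first display device
```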
Regarding claim 15, the combination of Zuccolotto, Su, and Sayadi teaches the medical image diagnostic system according to claim 14.
Sayadi, in combination with Zuccolotto and Su, further teaches:
wherein the vital information measurement device measures one or more of a heart rate, a blood pressure, a respiratory rate, a body temperature, an electrocardiogram, or a blood oxygen saturation concentration of the subject (Sayadi, P[13], P[17]: heart rate (P[13]), blood pressure (P[17]), respiratory rate (P[13]), body temperature (P[17]), and blood oxygen (P[17])).
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SHASHIDHAR S MANOHARAN whose telephone number is (571)272-6772. The examiner can normally be reached M-F 8:00-4:00.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Andrew Flanders can be reached at 571-272-7516. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/SHASHIDHAR SHANKAR MANOHARAN/Examiner, Art Unit 2655
/ANDREW C FLANDERS/Supervisory Patent Examiner, Art Unit 2655