Prosecution Insights
Last updated: April 19, 2026
Application No. 18/435,955

ELECTRONIC DEVICE WITH DICTATION STRUCTURE

Non-Final OA — §102, §103

Filed: Feb 07, 2024
Examiner: VOGT, JACOB BUI
Art Unit: 2653
Tech Center: 2600 — Communications
Assignee: Apple Inc.
OA Round: 1 (Non-Final)

Grant Probability: 57% (Moderate)
Expected OA Rounds: 1-2
Expected Time to Grant: 2y 10m
Grant Probability With Interview: 99%

Examiner Intelligence

Career Allow Rate: 57% (grants 57% of resolved cases; 4 granted / 7 resolved; -4.9% vs TC avg)
Interview Lift: +100.0% (strong lift for resolved cases with interview)
Typical Timeline: 2y 10m avg prosecution; 33 currently pending
Career History: 40 total applications across all art units
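As a rough illustration of how the headline metrics above can be derived from raw case counts, here is a minimal Python sketch. The 4 granted / 7 resolved figure comes from this page; the with/without-interview split in the example is purely hypothetical (the page does not report those counts) and is chosen only to demonstrate the lift formula.

```python
# Hedged sketch: deriving the dashboard's examiner metrics from case counts.
# The 4 granted / 7 resolved figures appear on this page; the interview
# split below is HYPOTHETICAL and only illustrates the arithmetic.

def allow_rate(granted: int, resolved: int) -> float:
    """Allow rate as a percentage of resolved cases."""
    return 100.0 * granted / resolved

def interview_lift(rate_with: float, rate_without: float) -> float:
    """Relative change in allow rate when an interview was held."""
    return 100.0 * (rate_with - rate_without) / rate_without

career = allow_rate(granted=4, resolved=7)
print(f"Career allow rate: {career:.0f}%")   # prints "Career allow rate: 57%"

# Hypothetical split: interviewed cases allowed at 100%, non-interviewed
# at 50%, doubling the rate and reproducing the displayed +100.0% lift.
lift = interview_lift(rate_with=allow_rate(2, 2),
                      rate_without=allow_rate(2, 4))
print(f"Interview lift: {lift:+.1f}%")       # prints "Interview lift: +100.0%"
```

The lift is a relative (not absolute) change, which is why a move from 50% to 100% reads as +100%.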

Statute-Specific Performance

§101: 35.1% (-4.9% vs TC avg)
§103: 43.8% (+3.8% vs TC avg)
§102: 8.7% (-31.3% vs TC avg)
§112: 10.6% (-29.4% vs TC avg)

Black line = Tech Center average estimate • Based on career data from 7 resolved cases

Office Action

DETAILED ACTION

This communication is in response to the Application filed on 02/07/2024. Claims 1-20 are pending and have been examined.

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Information Disclosure Statement

The IDS dated 02/07/2024 has been considered and placed in the application file.

Claim Interpretation

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art. The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification. The following terms in the claims have been given the following interpretations in light of the specification:

Facial interface: ¶ [0048], “As used herein, the term "facial interface" refers to a portion of the head-mountable device 100 that engages a user face via direct contact.” Thus, a facial interface is any component or portion of a head-mountable device that contacts a user’s face. This definition is used for purposes of searching for prior art, but cannot be incorporated into the claims.

Motion sensor: ¶ [0081], “The second sensor and/or the third sensor can be motion sensors disposed in proximity to the zygoma regions 109 or the maxilla regions 111 of the face of the user. … For example, the motion sensors (the second and third sensors) can be pressure sensors or strain gauges.” Thus, a motion sensor is any sensor that senses motion. The motion sensor can be a pressure sensor or a strain gauge. This definition is used for purposes of searching for prior art, but cannot be incorporated into the claims.
Partial view of the mouth: ¶ [0108], “In some examples, the visual data can include different orientations or angles of a field of view that at least partially includes a user's mouth (e.g., a profile view from a user-facing device with a full field of view of the user's mouth, a downward angled view from a jaw camera with a partial field of view of the user's mouth, etc.).” Thus, a partial view of the mouth can comprise a downward angled view from a camera positioned on the bottom of a head-mountable device. This definition is used for purposes of searching for prior art, but cannot be incorporated into the claims.

Should Applicant wish different definitions, Applicant should point to the portions of the specification that clearly show a different definition.

Claim Rejections - 35 USC § 102

The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:

A person shall be entitled to a patent unless – (a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention. (a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1, 16, 19, and 20 are rejected under 35 U.S.C. 102(a)(1) and (a)(2) as being anticipated by US Patent Publication 20190019508 A1 (Rochford et al.).

Claim 1

Regarding claim 1, Rochford et al. disclose a head-mountable device, comprising: a display frame disposed around the display (Rochford et al. ¶ [0056], "In certain embodiments, the head mounted display 310 is worn on the head of a user or part of a helmet similar to HMD 116 of FIG. 1" See Fig. 4, which illustrates a display frame that carries the display and other components); a vision sensor carried by the display frame and oriented externally in a downward direction that, when the head-mountable device is donned on a head of a user, is configured to detect mouth movement (Rochford et al. ¶ [0101], "Lip tracking sensor 420 is affixed to the head mounted display 405 and positioned to capture various movements of the user's lips. Lip tracking sensor 420 is configured similar to lip movement detector 270 of FIG. 2 and can include mouth camera 314 of FIG. 3." See Fig. 4, which illustrates downward-facing vision camera 420); a processor (Rochford et al. ¶ [0062], "Control unit 320 can be ... part of the head mounted display 310. The control unit 320 includes lip movement processor 322, eye focus processor 324, natural language processor 325,"); and a memory device storing instructions (Rochford et al. ¶ [0048], "The memory 260 can include persistent storage (not shown) that represents any structure(s) capable of storing and facilitating retrieval of information (such as data, program code, or other suitable information on a temporary or permanent basis).") that, when executed by the processor, cause the processor to convert visual data of the mouth movement to a text input (Rochford et al. ¶ [0063], "Lip movement processor 322 can track the motion of the lips, mouth, tongue, or a combination thereof. Lip movement processor 322 derives words and phrases based on the shape and movement of the user's mouth.").

Claim 16

Regarding claim 16, Rochford et al. disclose a wearable apparatus, comprising: a display housing (Rochford et al. ¶ [0056], "Head mounted display 310 is an electronic device that can display content, such as text, images and video, through a GUI, such as display 312." Head mounted display 310 is considered analogous to a display housing. See Fig. 4A (Head mounted display 310 is labelled 405)) including an optical dictation sensor (Rochford et al. ¶ [0058], "the mouth camera 314 can be on the external surface of the head mounted display 310." The mouth camera is considered analogous to an optical dictation sensor); a display positioned within the display housing (Rochford et al. ¶ [0056], "Head mounted display 310 is an electronic device that can display content, such as text, images and video, through a GUI, such as display 312."); and a facial interface connected to the display housing (Rochford et al. ¶ [0056], "the head mounted display 310 is worn on the head of a user or part of a helmet similar to HMD 116 of FIG. 1." The head mounted display is considered analogous to a facial interface; see claim interpretation section), the facial interface including a motion sensor (Rochford et al. ¶ [0035], "HMD 116 can include multiple camera sensors or motion sensors to record and track various movements of the user."), the motion sensor and the optical dictation sensor communicatively coupled to a processor (Rochford et al. ¶ [0045], "The processor 240 is also coupled to the input 250 and the display 255. ... Input 250 can be associated with lip movement detector 270 and eye focus detector 275. Input 250 can include one or more cameras for eye and lip movement detection").

Claim 19

Regarding claim 19, the rejection of claim 16 is incorporated. Rochford et al. further disclose wherein the optical dictation sensor includes a pair of vision sensors positioned within the display housing (Rochford et al. ¶ [0056]-[0058], "Head mounted display 310 includes ... mouth camera 314 ... In certain embodiments, mouth camera 314 includes two or more cameras." Head mounted display 310 is considered analogous to the display housing. See claim 16).

Claim 20

Regarding claim 20, the rejection of claim 16 is incorporated. Rochford et al. further disclose a memory device and the processor, the memory device comprising instructions (Rochford et al. ¶ [0048], "The memory 260 can include persistent storage (not shown) that represents any structure(s) capable of storing and facilitating retrieval of information (such as data, program code, or other suitable information on a temporary or permanent basis).") that, when executed by the processor, cause the processor to activate a silent dictation mode in response to detecting a user input to dictate (Rochford et al. ¶ [0058]-[0063], "In certain embodiments, mouth camera 314 continually monitors the user's mouth for movement. Once movement is detected, the movement is transmitted to lip movement processor 322, of the control unit 320. … Thereafter, the lip movement processor 322 derives a command given by the user based on the user's mouth movements. The command generated by the lip movement processor 322 is referred to as a derived command, as the command is derived based on the mouth, lip, and tongue movement of the user." Detecting a user's lip movement is considered analogous to detecting a user input to dictate).

Claims 6-9, 11-13, and 15 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by US Patent Publication 20250225995 A1 (von Liechtenstein).

Claim 6

Regarding claim 6, von Liechtenstein discloses a system, comprising: a wearable device (von Liechtenstein ¶ [0048], "A user 100 is wearing smart glasses and wireless ear buds." Smart glasses (i.e., a head-mountable device) are considered analogous to a wearable device) communicatively coupled (von Liechtenstein ¶ [0042], "The various devices may be interconnected via a wireless connection in order to make up the composite apparatus.") to an electronic device comprising a first sensor (von Liechtenstein ¶ [0042], "The user's face, in particular the area around the mouth 105 is observed by cameras 103 104 which may be hosted on a handheld device 102 or wrist-attached device 101. ... The various devices may be interconnected via a wireless connection in order to make up the composite apparatus." The handheld device is considered analogous to an electronic device comprising a first sensor. User-facing cameras 103 and 104 are considered analogous to the first sensor), the wearable device comprising: a second sensor (von Liechtenstein ¶ [0048], "The temple tips 201 may have an integrated bone conducting transducer 202 which is pressed against the user's skin by pressure exerted through the frame of the head mountable device, of which the temple tips may be an integral component." Bone conducting transducer 202 is considered analogous to a second sensor); a processor (von Liechtenstein ¶ [0057], "The head mountable device may comprise... a computing unit 321"); and a memory device storing instructions (von Liechtenstein ¶ [0027], "aspects of the present inventions may take the form of a computer program product embodied in one or more computer-readable medium(s) having computer readable program code/instructions embodied thereon") that, when executed by the processor, cause the processor to: identify sensor data from the first sensor (von Liechtenstein ¶ [0059], "A lip-reading processor may take a video stream of the user's mouth area, or general face area, as input") and the second sensor (von Liechtenstein ¶ [0048], "The temple tips may also comprise a user-facing laser microphone 221 configured to pick up telltale oscillations which may be associated with silent speech from the skin around the temple area of the user's skull."); generate a predicted dictation based on the sensor data (von Liechtenstein ¶ [0059], "A lip-reading processor may take a video stream of the user's mouth area, or general face area, as input and transcribe speech, including silent speech."); and present, for display at the wearable device, a graphical representation of the predicted dictation (von Liechtenstein ¶ [0094], "In another use case, an embodiment may display the resulting text from the speech recognition step 609 visually. In a preferred embodiment, the speech uttered by a speaker who is subject to a gaze dwell is displayed as subtitles 611 on a display which is integrated with the head mountable device.").

Claim 7

Regarding claim 7, the rejection of claim 6 is incorporated. von Liechtenstein further discloses wherein: the first sensor is a first type of sensor (von Liechtenstein ¶ [0042], "The user's face, in particular the area around the mouth 105 is observed by cameras 103 104 which may be hosted on a handheld device 102 or wrist-attached device 101."); and the second sensor is a second type of sensor different from the first type of sensor (von Liechtenstein ¶ [0048], "The temple tips 201 may have an integrated bone conducting transducer 202 which is pressed against the user's skin by pressure exerted through the frame of the head mountable device, of which the temple tips may be an integral component.").

Claim 8

Regarding claim 8, the rejection of claim 7 is incorporated. von Liechtenstein further discloses wherein: the first type of sensor comprises a camera (von Liechtenstein ¶ [0042], "The user's face, in particular the area around the mouth 105 is observed by cameras 103 104 which may be hosted on a handheld device 102 or wrist-attached device 101."); and the second type of sensor comprises at least one of an acoustic sensor, a pressure sensor, a strain gauge, a vibration detector, a breath detector, or a biometric sensor (von Liechtenstein ¶ [0048], "The temple tips 201 may have an integrated bone conducting transducer 202 which is pressed against the user's skin by pressure exerted through the frame of the head mountable device, of which the temple tips may be an integral component." A bone conducting transducer is considered analogous to a vibration detector).

Claim 9

Regarding claim 9, the rejection of claim 6 is incorporated.
von Liechtenstein further discloses wherein: the first sensor is oriented in a first orientation (von Liechtenstein ¶ [0042], "The user's face, in particular the area around the mouth 105 is observed by cameras 103 104 which may be hosted on a handheld device 102 or wrist-attached device 101." The first sensor's orientation faces the user from a handheld device); and the second sensor is oriented in a second orientation different from the first orientation (von Liechtenstein ¶ [0048], "The temple tips 201 may have an integrated bone conducting transducer 202 which is pressed against the user's skin by pressure exerted through the frame of the head mountable device, of which the temple tips may be an integral component." The second sensor is oriented towards a user's temple area from the frame of a head mountable device. This orientation is different from the first sensor's orientation).

Claim 11

Regarding claim 11, the rejection of claim 6 is incorporated. von Liechtenstein further discloses wherein the electronic device comprises an external client device (von Liechtenstein ¶ [0042], "The user's face, in particular the area around the mouth 105 is observed by cameras 103 104 which may be hosted on a handheld device 102 or wrist-attached device 101. ... The various devices may be interconnected via a wireless connection in order to make up the composite apparatus." The handheld device is considered analogous to an external client device).

Claim 12

Regarding claim 12, the rejection of claim 6 is incorporated. von Liechtenstein further discloses wherein generating the predicted dictation comprises utilizing contextual awareness (von Liechtenstein ¶ [0069], "One use case for a head mountable embodiment is the touchless entry of a PIN number into a point of sale (POS) terminal 539. ... the eye tracker 568 in combination with the forward-facing camera 533, determines that the user has made eye contact with the POS terminal 539. ... Once the POS terminal is addressable, either through transmitting modulated light to it, or alternatively by addressing it through an Internet-centric communication protocol, then embodiments may transmit the required PIN number. ... Once the silent speech processor has detected a PIN number silent speech input it is transmitted to the POS terminal which has been detected in the first step." The HMD utilizes the contextual awareness of a POS to expect, dictate, and transmit a PIN number using silent speech).

Claim 13

Regarding claim 13, the rejection of claim 12 is incorporated. von Liechtenstein further discloses wherein the contextual awareness comprises user activity (von Liechtenstein ¶ [0069], "The described use case starts with a customer in shop approaching the POS terminal and making eye contact with it. With eye contact being established, an interfacing between the user and the POS terminal is then automatically initiated." Making eye contact with the POS terminal is considered analogous to user activity).

Claim 15

Regarding claim 15, the rejection of claim 6 is incorporated. von Liechtenstein further discloses wherein generating the predicted dictation includes using a machine-learning model (von Liechtenstein ¶ [0059]-[0060], "An output of the lip-reading processor may be a stream of recognized visemes, wherein the recognized visemes may be an input to the phoneme disambiguation step during subsequent speech recognition by the speech recognition processor 254. ... The described embodiment [of the speech recognition processor] implemented the MLPerf version of RNNT, a large data-center network, on Librispeech without any additional hardware-aware retraining." RNNT is considered analogous to a machine-learning model).

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 2-4, 17, and 18 are rejected under 35 U.S.C. 103 as obvious over US Patent Publication 20190019508 A1 (Rochford et al.) in view of "Facial Performance Sensing Head-Mounted Display" (Li et al.).

Claim 2

Regarding claim 2, the rejection of claim 1 is incorporated. Rochford et al. disclose all the elements of the claimed invention as stated above. Rochford et al. do not explicitly disclose an additional sensor configured to detect facial vibrations or deformations. However, Li et al. disclose an additional sensor configured to detect at least one of a facial vibration or a facial deformation (Li et al. pg. 3, Section 3.1, Paragraph 1, "we place eight strain gauges on the foam liner of the headset (Figure 1)." A strain gauge is considered analogous to an additional sensor). It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention of the instant application to modify Rochford et al.’s dictation-enabled HMD to incorporate Li et al.’s additional sensor. The suggestion/motivation for doing so would have been that, “Compared to other sensors, direct measurements of surface strain have low latency, do not suffer from complex muscular crosstalk, are suitable for nonverbal communication, and can accurately reproduce the linearity of human facial expressions,” as noted by Li et al. in pg. 2, Section 1, Paragraph 5.

Claim 3

Regarding claim 3, the rejection of claim 2 is incorporated. Rochford et al. in view of Li et al. disclose all the elements of the claimed invention as stated above. Li et al. further disclose wherein the additional sensor is positioned in direct contact with a face of the user (Li et al. pg. 3, Section 3.1, Paragraph 1, "we place eight strain gauges on the foam liner of the headset (Figure 1)." See Figure 1, which illustrates that the strain sensors directly contact a user's face).

Claim 4

Regarding claim 4, the rejection of claim 2 is incorporated. Rochford et al. in view of Li et al. disclose all the elements of the claimed invention as stated above. Rochford et al. further disclose wherein the processor activates a silent text input mode in response to receiving sensor data (Rochford et al. ¶ [0058]-[0063], "In certain embodiments, mouth camera 314 continually monitors the user's mouth for movement. Once movement is detected, the movement is transmitted to lip movement processor 322, of the control unit 320. … Thereafter, the lip movement processor 322 derives a command given by the user based on the user's mouth movements. The command generated by the lip movement processor 322 is referred to as a derived command, as the command is derived based on the mouth, lip, and tongue movement of the user.") [from the additional sensor]. Li et al. disclose the additional sensor; see the claim 2 rejection.

Claim 17

Regarding claim 17, the rejection of claim 16 is incorporated. Rochford et al. disclose all the elements of the claimed invention as stated above. Rochford et al. do not explicitly disclose a motion sensor disposed in proximity to a zygoma region or maxilla region of a user’s face. However, Li et al. disclose wherein the motion sensor is disposed in proximity to a zygoma region or a maxilla region of a face of a user when donned (Li et al. pg. 3, Section 3.1, Paragraph 1, "we place eight strain gauges on the foam liner of the headset (Figure 1)." See Figure 1, which shows an HMD with strain sensors located around the maxilla region of a user's face. See claim interpretation section). It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention of the instant application to modify Rochford et al.’s dictation-enabled HMD to incorporate Li et al.’s motion sensor. The suggestion/motivation for doing so is similar to the suggestion/motivation described above with respect to claim 2.

Claim 18

Regarding claim 18, the rejection of claim 16 is incorporated. Rochford et al. disclose all the elements of the claimed invention as stated above. Rochford et al. do not explicitly disclose a motion sensor comprising at least one of a pressure sensor or a strain gauge. However, Li et al. disclose wherein the motion sensor comprises at least one of a pressure sensor or a strain gauge (Li et al. pg. 7, Section 6, Paragraph 4, "the strain signals can capture some of the larger mouth motions such as smile or mouth open" See claim interpretation section). It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention of the instant application to modify Rochford et al.’s dictation-enabled HMD to incorporate Li et al.’s motion sensor. The suggestion/motivation for doing so is similar to the suggestion/motivation described above with respect to claim 2.

Claim 5 is rejected under 35 U.S.C. 103 as obvious over US Patent Publication 20190019508 A1 (Rochford et al.) in view of US Patent Publication 20200226814 A1 (Tang et al.).

Claim 5

Regarding claim 5, the rejection of claim 1 is incorporated. Rochford et al. disclose all the elements of the claimed invention as stated above. Rochford et al. further disclose a second sensor including an internal-facing camera (Rochford et al. ¶ [0059], "the eye camera 316 can be on an internal surface on the head mounted display 310 positioned to view the user's eyes.") to detect an input selection based on eye gaze (Rochford et al. ¶ [0059], "eye camera 316 is a singular camera and apparatus to detect a eye focus of a user."). Rochford et al. do not explicitly disclose a third sensor. However, Tang et al. disclose a third sensor including an external-facing camera (Tang et al. ¶ [0017]-[0018], "The head-mounted display device 10 may further include ... a depth imaging device 22 ... Depth imaging device 22 may include an infrared light-based depth camera (also referred to as an infrared light camera)" See Fig. 1, which illustrates depth imaging device 22 as an external-facing camera) to detect a hand gesture indicating confirmation of the input selection (Tang et al. ¶ [0048], "once a virtual object is targeted, the head-mounted display may recognize a selection gesture from the user's hand based on information received from the depth camera, and select the targeted object responsive to recognizing the selection gesture."). It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention of the instant application to modify Rochford et al.’s dictation-enabled HMD to incorporate Tang et al.’s gesture sensor. The suggestion/motivation for doing so would have been that, “casting directly from the hand may allow for more intuitive targeting and fine control. By using the palm, the user also retains the mental and physical freedom to manipulating targeted objects with their fingers,” as noted by the Tang et al. disclosure in ¶ [0026].

Claim 10 is rejected under 35 U.S.C. 103 as obvious over US Patent Publication 20250225995 A1 (von Liechtenstein) in view of US Patent Publication 20190019508 A1 (Rochford et al.).
Claim 10

Regarding claim 10, the rejection of claim 9 is incorporated. von Liechtenstein discloses all the elements of the claimed invention as stated above. von Liechtenstein further discloses wherein: in the first orientation, the first sensor has a full view of a mouth of a user (von Liechtenstein ¶ [0042], "The user's face, in particular the area around the mouth 105 is observed by cameras 103 104 which may be hosted on a handheld device 102 or wrist-attached device 101." The first sensor's orientation faces the user from a handheld device). von Liechtenstein does not explicitly disclose the second sensor having a partial view of the mouth. However, Rochford et al. disclose wherein in the second orientation, the second sensor has a partial view of the mouth of the user (Rochford et al. ¶ [0101], "Lip tracking sensor 420 is affixed to the head mounted display 405 and positioned to capture various movements of the user's lips." See Fig. 4, which illustrates downward-facing camera 420 affixed to the HMD. Downward-facing camera 420 is considered analogous to a second sensor with a partial view of the user's mouth; see claim interpretation section). It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention of the instant application to modify von Liechtenstein’s dictation-enabled devices to include Rochford et al.’s downward-facing camera affixed to an HMD because such a modification is the result of simple substitution of one known element for another producing a predictable result. More specifically, von Liechtenstein’s bone-conducting transducer and Rochford et al.’s downward-facing camera affixed to an HMD perform the same general and predictable function, the predictable function being providing sensor data to enable dictating a user’s silent speech.
Since each individual element and its function are shown in the prior art, albeit in separate references, the difference between the claimed subject matter and the prior art rests not on any individual element or function but in the very combination itself - that is, in the substitution of von Liechtenstein’s bone-conducting transducer with Rochford et al.’s downward-facing camera affixed to an HMD. Thus, the simple substitution of one known element for another producing a predictable result renders the claim obvious.

Claim 14 is rejected under 35 U.S.C. 103 as obvious over US Patent Publication 20250225995 A1 (von Liechtenstein) in view of US Patent Publication 20220066207 A1 (Croxford et al.) further in view of US Patent Publication 20220059231 A1 (Saleh).

Claim 14

Regarding claim 14, the rejection of claim 6 is incorporated. von Liechtenstein discloses all the elements of the claimed invention as stated above. von Liechtenstein further discloses wherein the memory device further comprises instructions that, when executed by the processor, cause the processor to activate a silent dictation mode (von Liechtenstein ¶ [0092], "The viseme recognition step 607, in turn is preceded by step for picking up lip movements of the speaker who is being identified by a face being visually detected in direction of the bearing of a gaze dwell, whereas the gaze dwell may be detected in the preceding step 604. Thus ... phoneme recognition is both audio-derived and visually disambiguated with visemes derived from the lip movements of an algorithmically detected speaker." See Figure 6, which outlines activating a lip reading functionality based on gazing at a human speaker) [in response to at least one sensor of the wearable device detecting a person within a threshold vicinity of the at least one sensor]. von Liechtenstein does not explicitly disclose activating functionality based on a threshold vicinity to a sensor. However, Croxford et al. disclose wherein the memory device further comprises instructions that, when executed by the processor, cause the processor to activate a silent dictation mode in response to at least one sensor of the wearable device detecting a person within a [threshold] vicinity of the at least one sensor (Croxford et al. ¶ [0049]-[0050], "FIG. 2 illustrates a pair of smart glasses 202 ... The smart glasses 202 include a central frame portion 213 ... the central frame portion 213 houses two front-facing cameras 212a, 212b" ¶ [0055]-[0061], "the front-facing cameras 212a and 212b record images and store the images in memory 108. ... computer vision processing is a person detection process to determine if there is a person within the image. ... if a person is detected, ... video is captured by the cameras in a direction that includes a direction in which the person was detected. ... a mouth portion of a face is identified and localised. ... The localised video of the mouth region is analysed by performing lip reading to recognise what the person is saying based on lip movements in the video of the mouth region and recording the recognised speech as text." A lip reading mode activated by detecting a person within an image is considered analogous to a silent dictation mode activated by detecting a person within a vicinity of a sensor of an HMD). It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention of the instant application to modify von Liechtenstein’s dictation-enabled devices to incorporate Croxford et al.’s distance-activated silent dictation mode. The suggestion/motivation for doing so would have been that, “The smart glasses 202 may be particularly suitable for use by a hearing-impaired user as the additional information provided may allow the hearing-impaired user to understand speech in situations in which it would otherwise be difficult,” as noted by the Croxford et al. disclosure in ¶ [0086].

von Liechtenstein in view of Croxford et al. do not explicitly disclose a threshold vicinity. However, Saleh discloses a threshold vicinity (Saleh ¶ [0040]-[0042], "the smart face protective device (SFPD) comprises: ... a Health & Safety Monitoring Unit” ¶ [0026], "the Health & Safety Monitoring Unit further comprises a social distance detector for detecting presence of an intruding person within a predefined minimum social safety distance threshold of the user, and wherein the notification unit sends a notification signal (alarm signal) when the intruding person is detected within the predefined minimum social safety distance threshold."). It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention of the instant application to modify von Liechtenstein in view of Croxford et al. to include Saleh’s threshold vicinity because such a modification is the result of combining prior art elements according to known methods to yield predictable results. More specifically, Croxford et al.’s distance-activated silent dictation mode as modified by Saleh’s threshold vicinity can yield a predictable result of improving user experience, since the ability to set a threshold vicinity for activating silent dictation mode would give the user more options to control and tune their HMD to their own preferences. Thus, a person of ordinary skill would have appreciated including in von Liechtenstein in view of Croxford et al. the ability to use Saleh’s threshold vicinity, since the claimed invention is merely a combination of old elements, and in the combination each element merely would have performed the same function as it did separately, and one of ordinary skill in the art would have recognized that the results of the combination were predictable.
Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to JACOB B VOGT, whose telephone number is (571) 272-7028. The examiner can normally be reached Monday - Friday, 9:30am - 7pm EST.

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Paras D Shah, can be reached at (571) 270-1650. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/JACOB B VOGT/ Examiner, Art Unit 2653

/Paras D Shah/ Supervisory Patent Examiner, Art Unit 2653

09/18/2025

Prosecution Timeline

Feb 07, 2024
Application Filed
Sep 18, 2025
Non-Final Rejection — §102, §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12505279
METHOD AND SYSTEM FOR DOMAIN ADAPTATION OF SOCIAL MEDIA TEXT USING LEXICAL DATA TRANSFORMATIONS
2y 5m to grant Granted Dec 23, 2025
Study what changed to get past this examiner. Based on the 1 most recent grant.


Prosecution Projections

1-2
Expected OA Rounds
57%
Grant Probability
99%
With Interview (+100.0%)
2y 10m
Median Time to Grant
Low
PTA Risk
Based on 7 resolved cases by this examiner. Grant probability derived from career allow rate.
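The projections above reduce to simple arithmetic over the examiner's resolved cases: the grant probability is the career allow rate (4 granted / 7 resolved ≈ 57%), and the interview lift compares allow rates with and without an interview. A minimal Python sketch of that derivation follows. The per-case interview split is not published, so the records below are hypothetical: they reproduce the 4/7 career rate but yield an illustrative +150% lift rather than the dashboard's +100% figure, whose underlying counts are not shown.

```python
from dataclasses import dataclass

@dataclass
class ResolvedCase:
    granted: bool
    had_interview: bool

# Hypothetical records for 7 resolved cases (4 granted), chosen to match
# the dashboard's 4/7 career allow rate; the true interview split is unknown.
cases = (
    [ResolvedCase(granted=True, had_interview=True)] * 2
    + [ResolvedCase(granted=True, had_interview=False)] * 2
    + [ResolvedCase(granted=False, had_interview=False)] * 3
)

def allow_rate(subset):
    """Fraction of cases in `subset` that were granted."""
    return sum(c.granted for c in subset) / len(subset) if subset else 0.0

career = allow_rate(cases)                                          # 4/7 ≈ 57%
with_iv = allow_rate([c for c in cases if c.had_interview])         # 2/2 = 100%
without_iv = allow_rate([c for c in cases if not c.had_interview])  # 2/5 = 40%

# Interview lift: relative increase in allow rate when an interview was held.
lift = with_iv / without_iv - 1 if without_iv else float("inf")

print(f"Career allow rate: {career:.0%}")
print(f"Interview lift:    {lift:+.0%}")
```

With only 7 resolved cases, both rates carry wide error bars; treat the lift as a directional signal (interviews correlate with grants for this examiner) rather than a calibrated probability.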
