DETAILED ACTION
This action is in response to the remarks filed 02/23/2026. Claims 1 – 22 are pending and have been examined.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Arguments
Applicant's arguments filed 02/23/2026 have been fully considered but they are not persuasive.
In remarks, Applicant lists Figures 3, 6A, 6B, 6C and 6D, and Paragraphs [0087], [0088], [0089], [0090], [0091], and [0092] of Agrawal (U.S. Pub. No. 2023/0276115), and states: “In other words, FIGS. 6A-6B of Agrawal and related paragraphs teach that the front camera 132A of the electronic device 100 captures the image of user 610 and displays it on the display 130. It should be noted that user 610's gaze is directed towards the electronic device 100. On the other hand, FIGS. 6C-6D of Agrawal and related paragraphs teach that the rear camera of the electronic device 100 captures the image of dog 626 and displays it on the display 130. It should be noted that user 610's gaze is directed towards dog 626. In short, the electronic device 100 of Agrawal can capture the direction of user 610's gaze and determine whether the image displayed on the display 130 comes from the front camera 132A or another rear camera based on the direction of the gaze.”
Agrawal teaches a method for selecting an active camera from among a front-facing camera and at least one rear-facing camera. The method determines the first user’s eye gaze direction and, upon determining that direction, selects as the source of the displayed imagery the camera whose field of view contains what the user is looking at. Therefore, if the first user is looking away from the screen, such as at a dog positioned behind the phone, the rear camera is selected for imagery output.
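For illustration of this selection logic only, the following is a minimal sketch assuming a hypothetical two-camera layout and an already-estimated gaze angle; none of the names or values below are drawn from Agrawal:

```python
from dataclasses import dataclass

@dataclass
class Camera:
    name: str
    fov_deg: tuple  # (min, max) gaze angle covered, relative to the display normal

# Hypothetical layout: 0 degrees means looking straight at the display;
# angles near 180 degrees mean looking behind the device.
CAMERAS = [
    Camera("front_main", (-60.0, 60.0)),
    Camera("rear_wide", (120.0, 240.0)),
]

def select_active_camera(gaze_angle_deg: float) -> Camera:
    """Select the camera whose field of view contains the user's gaze target."""
    for cam in CAMERAS:
        lo, hi = cam.fov_deg
        if lo <= gaze_angle_deg <= hi:
            return cam
    return CAMERAS[0]  # default: remain on the front facing camera

print(select_active_camera(5.0).name)    # front_main: user looks at the screen
print(select_active_camera(180.0).name)  # rear_wide: user looks behind the phone
```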
“In comparison, paragraph [0021] of the instant application recites that the second processing circuit 300 may determine that the remote user of the first video communication device 20 is gazing an image position, such as the image of an Object-of-Interest, in the second real-time images over a period of time. In other words, the eye gaze direction is a direction which the first user looks at the second real-time images, which is different from the user 610's gaze of FIGS. 6C-6D of Agrawal is directed towards dog 626 and not directed towards the electronic device 100. In this regard, applicant amends claims 4 and 17 to specify the technical feature of "the eye gaze direction is a direction which the first user looks at the second real-time images".”
This amended limitation of Claims 4 and 17 is taught by Agrawal in view of Chu, “wherein the eye gaze direction is a direction which the first user looks at the second real-time images (see Agrawal Paragraph [0087], In one embodiment, video communication session 386 can be a video call. Front facing camera 132A has a field of view (FOV) 612 that captures an image stream 240 including first image 242 containing the face 614 of user 610. The face 614 of the user 610 includes a pair of eyes 616 that are looking in first gaze direction 260A. In FIG. 6A, the user is looking at display 130 embedded in front surface 176 of electronic device 100 and the eye gaze direction 260A is oriented toward display 130. First location 262A is at the same location as electronic device 100, and Figures 6A and 6B, item 616 is the eye gaze direction of the first user toward their display, which shows the video stream of a second user, item 634).” As seen in Figures 6A and 6B, the eye gaze direction of the first user is directed at the second user of the video call.
“On the other hand, both the network video communication device of the instant application and the electronic device 100 of Agrawal capture the local user's image and analyze first user information such as the user's eye gaze direction. However, the network video communication device of the instant application transmits the first user information 25 to other network video devices in the video conference. These other network video devices can then generate real-time adjusted images based on the first user information and send them back to the network video communication device, which can then display the real-time adjusted images on the display of the network video communication device. In short, the real-time adjusted images of the instant application are generated by other network video devices in the video conference based on the first user information of the local user, which is different from the display image source of Agrawal is selected by the local electronic device 100.
Therefore, Agrawal does not teach the features "receiving, by the transmission circuit, a second video signal, wherein the second video signal comprises a second real-time image captured by a second video communication device which is another one of the participants of the online video conference;...controlling, by the processing circuit, the transmission circuit to transmit the first user information to the second video communication device; receiving, by the transmission circuit, the second video signal comprising the adjusted real-time image with an adjusted field of view, wherein the adjusted field of view is directed to a focus area corresponding to the eye gaze direction of the first user" in claim 1 of the instant application. Note that, Chu cannot cure the deficiency of Agrawal.
Thus, claim 1 includes features not disclosed nor taught by Agrawal, Chu, or their combination, and should be allowable. Claims 9 and 16 include limitations related to claim 1, and should be allowable. Claims 2-8, 10-15 and 17-22 are dependent on claims 1, 9 and 16, respectively, and should be allowable if claims 1, 9 and 16 are allowed.”
When interpreting Claim 1 under the broadest reasonable interpretation (BRI), Agrawal does teach the respective features as stated in the Non-Final Rejection filed 11/16/2025. Agrawal teaches receiving, by the transmission circuit, the second video signal comprising the adjusted real-time image with an adjusted field of view, wherein the adjusted field of view is directed to a focus area corresponding to the eye gaze direction of the first user (see Agrawal Figures 6A – 6F, in which a second video signal is received from second user in video communication, Paragraph [0089], The captured video image 630 of user 610 (first user) captured by front facing camera 132A is shown in captured image/video window 632 on display 130. The captured video image 630 can be transmitted to second electronic device 300 as part of video communication session 386. A received image or video stream 634 of a user of second electronic device 300 is shown in received image/video window 636 on display 130, and Paragraph [0041], Communication module 138 includes program code that is executed by processor 102 to enable electronic device 100 to communicate with other external devices and systems, and Figures 6C and 6D, in which the second video signal is displayed on the interface with the updated real-time image with an adjusted field of view, in which the first user has directed his attention (detected eye gaze) to his dog, and the rear camera is activated to collect real-time video that is transmitted to the call);
controlling, by the processing circuit, the display to display the second real-time image with the adjusted field of view in the video conference screen (see Agrawal Figures 6A – 6F, in which a second video signal is received from second user in video communication, Paragraph [0089], The captured video image 630 of user 610 (first user) captured by front facing camera 132A is shown in captured image/video window 632 on display 130. The captured video image 630 can be transmitted to second electronic device 300 as part of video communication session 386. A received image or video stream 634 of a user of second electronic device 300 is shown in received image/video window 636 on display 130, and Paragraph [0041], Communication module 138 includes program code that is executed by processor 102 to enable electronic device 100 to communicate with other external devices and systems, and Figures 6C and 6D, in which the second video signal is displayed on the interface with the updated real-time image with an adjusted field of view, in which the first user has directed his attention (detected eye gaze) to his dog, and the rear camera is activated to collect real-time video that is transmitted to the call);
wherein the second real-time image with the adjusted field of view is transmitted to the participants of the online video conference (see Agrawal Figures 6A – 6F, in which a second video signal is received from second user in video communication, Paragraph [0089], The captured video image 630 of user 610 (first user) captured by front facing camera 132A is shown in captured image/video window 632 on display 130. The captured video image 630 can be transmitted to second electronic device 300 as part of video communication session 386. A received image or video stream 634 of a user of second electronic device 300 is shown in received image/video window 636 on display 130, and Paragraph [0041], Communication module 138 includes program code that is executed by processor 102 to enable electronic device 100 to communicate with other external devices and systems, and Figures 6C and 6D, in which the second video signal is displayed on the interface with the updated real-time image with an adjusted field of view, in which the first user has directed his attention (detected eye gaze) to his dog, and the rear camera is activated to collect real-time video that is transmitted to the call). Agrawal expressly teaches the limitations above under BRI, and therefore Claims 1, 9 and 16 remain rejected under 35 U.S.C. 103, as well as Claims 2 – 8, 10 – 15, and 17 – 22.
Response to Amendment
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1, 2, 8 – 10, 13, 14, 16, 21 and 22 are rejected under 35 U.S.C. 103 as being unpatentable over Agrawal et al. (U.S. Pub. No. 2023/0276115, hereinafter “Agrawal”) in view of Chu et al. (U.S. Pub. No. 2017/0046111, hereinafter “Chu”).
Regarding Claim 1, Agrawal teaches
A network video communication device (see Agrawal Figure 1A, communication electronic device, Paragraph [0003], Communication devices, such as cell phones, tablets, and laptops, are widely used for communication and data transmission. These communication devices support various communication modes/applications, such as text messaging, audio calling and video calling, may be used for a conference call (Paragraph [0084])), comprising:
a transmission circuit (see Agrawal Paragraph [0041], Communication module 138 includes program code that is executed by processor 102 to enable electronic device 100 to communicate with other external devices and systems);
a display (see Agrawal Figure 1A, display 130);
an image capture circuit, for capturing a first real-time image (see Agrawal front ICDs, i.e., front facing cameras 132A-132B (main and wide cameras), and rear ICDs, i.e., rear facing cameras 133A-133C (main, wide, and telephoto cameras)); and
a processing circuit (see Agrawal Figure 1A, processor), coupled to the transmission circuit, the image capture circuit and the display (see Agrawal Figure 1A, processor coupled to system memory, which includes communication module, display, and image capture device controller, which is connected to the front and rear cameras); wherein the network video communication device performs the following steps:
controlling, by the processing circuit, the transmission circuit to connect a server and join an online video conference as one of a plurality of participants of the online video conference (see Agrawal Paragraph [0048], wireless network 150 can include one or more servers 190 that support exchange of wireless data and video and other communication between electronic device 100 and second electronic device 300, Paragraph [0041], Communication module 138 includes program code that is executed by processor 102 to enable electronic device 100 to communicate with other external devices and systems, and Paragraph [0126], Method 1000 begins at the start block 1002 (video communication begins and users join). At block 1004, processor 102 detects that video data 270 is being captured by the current active camera);
controlling, by the processing circuit, the image capture circuit to capture the first real-time image of a first user local to the network video communication device (see Agrawal Figure 10 and Paragraph [0126], Processor 102 triggers front facing camera 132A to capture (monitor) an image stream 240 including second image 244 containing a face 614 of user 610 (first user) (block 1006) and 1008, receive image stream of face of user (first real-time image), and audio is captured using microphone 108, Paragraph [0060], Audio communication input 222 is audio received via microphone 108, and Paragraph [0058], Communication module 138 enables electronic device 100 to communicate with wireless network 150 and with other devices, such as second electronic device 300, via one or more of audio, text, and video communications. Communication module 138 supports various communication sessions by electronic device 100, such as audio communication sessions, video communication sessions, text communication sessions, communication device application communication sessions, or a dual/combined audio/text/video communication session);
receiving, by the transmission circuit, a second video signal, wherein the second video signal comprises a second real-time image captured by a second video communication device which is another one of the participants of the online video conference (see Agrawal Figures 6A – 6F, in which a second video signal is received from second user in video communication, Paragraph [0089], The captured video image 630 of user 610 (first user) captured by front facing camera 132A is shown in captured image/video window 632 on display 130. The captured video image 630 can be transmitted to second electronic device 300 as part of video communication session 386. A received image or video stream 634 of a user of second electronic device 300 is shown in received image/video window 636 on display 130, and Paragraph [0041], Communication module 138 includes program code that is executed by processor 102 to enable electronic device 100 to communicate with other external devices and systems);
controlling, by the processing circuit, the display to display a video conference screen, wherein the video conference screen comprises the second real-time image (see Agrawal Figures 6A – 6F, in which a second video signal is received from second user in video communication, Paragraph [0089], The captured video image 630 of user 610 (first user) captured by front facing camera 132A is shown in captured image/video window 632 on display 130. The captured video image 630 can be transmitted to second electronic device 300 as part of video communication session 386. A received image or video stream 634 of a user of second electronic device 300 is shown in received image/video window 636 on display 130, may be used for a conference call (Paragraph [0084]));
receiving, by the transmission circuit, the second video signal comprising the adjusted real-time image with an adjusted field of view, wherein the adjusted field of view is directed to a focus area corresponding to the eye gaze direction of the first user (see Agrawal Figures 6A – 6F, in which a second video signal is received from second user in video communication, Paragraph [0089], The captured video image 630 of user 610 (first user) captured by front facing camera 132A is shown in captured image/video window 632 on display 130. The captured video image 630 can be transmitted to second electronic device 300 as part of video communication session 386. A received image or video stream 634 of a user of second electronic device 300 is shown in received image/video window 636 on display 130, and Paragraph [0041], Communication module 138 includes program code that is executed by processor 102 to enable electronic device 100 to communicate with other external devices and systems, and Figures 6C and 6D, in which the second video signal is displayed on the interface with the updated real-time image with an adjusted field of view, in which the first user has directed his attention (detected eye gaze) to his dog, and the rear camera is activated to collect real-time video that is transmitted to the call);
controlling, by the processing circuit, the display to display the second real-time image with the adjusted field of view in the video conference screen (see Agrawal Figures 6A – 6F, in which a second video signal is received from second user in video communication, Paragraph [0089], The captured video image 630 of user 610 (first user) captured by front facing camera 132A is shown in captured image/video window 632 on display 130. The captured video image 630 can be transmitted to second electronic device 300 as part of video communication session 386. A received image or video stream 634 of a user of second electronic device 300 is shown in received image/video window 636 on display 130, and Paragraph [0041], Communication module 138 includes program code that is executed by processor 102 to enable electronic device 100 to communicate with other external devices and systems, and Figures 6C and 6D, in which the second video signal is displayed on the interface with the updated real-time image with an adjusted field of view, in which the first user has directed his attention (detected eye gaze) to his dog, and the rear camera is activated to collect real-time video that is transmitted to the call);
wherein the second real-time image with the adjusted field of view is transmitted to the participants of the online video conference (see Agrawal Figures 6A – 6F, in which a second video signal is received from second user in video communication, Paragraph [0089], The captured video image 630 of user 610 (first user) captured by front facing camera 132A is shown in captured image/video window 632 on display 130. The captured video image 630 can be transmitted to second electronic device 300 as part of video communication session 386. A received image or video stream 634 of a user of second electronic device 300 is shown in received image/video window 636 on display 130, and Paragraph [0041], Communication module 138 includes program code that is executed by processor 102 to enable electronic device 100 to communicate with other external devices and systems, and Figures 6C and 6D, in which the second video signal is displayed on the interface with the updated real-time image with an adjusted field of view, in which the first user has directed his attention (detected eye gaze) to his dog, and the rear camera is activated to collect real-time video that is transmitted to the call).
Agrawal does not expressly teach
determining, by the processing circuit, an identity or a status of the first user corresponding to the online video conference;
processing, by the processing circuit, the first real-time image of the first user to generate a first user information, wherein the first user information comprises an eye gaze direction of the first user if the identity or the status of the first user meets a requirement;
controlling, by the processing circuit, the transmission circuit to transmit the first user information to the second video communication device;
However, Chu teaches
determining, by the processing circuit, an identity or a status of the first user corresponding to the online video conference (see Chu Paragraph [0012], transmitting at least one of information input to the electronic device in association with the contents and status information of the electronic device to the external electronic device, Paragraph [0135], the instructions may be executed in such a way that the processor 120 controls the communication circuit to receive information on the line of sight of the user of the electronic device 101 to at least part of the contents and transmits to the external electronic device the line-of-sight information as the status information of the electronic device 101, and Paragraph [0161] the server 420 may forward the acknowledgement message received from the second electronic device 410 to the first electronic device 400 and perform authentication on the first and second electronic devices 400 and 410 simultaneously or sequentially. For example, the server may determine whether the first and second electronic devices 400 and 410 are registered with the server 420 for the second communication (e.g., SWIS communication). According to various embodiments, the server 420 may determine whether the first and second electronic devices 400 and 410 are registered with the server 420 based on the identity information received therefrom. The identity information may be phone number or unique device information of the first and second electronic devices 400 and 410);
processing, by the processing circuit, the first real-time image of the first user to generate a first user information, wherein the first user information comprises an eye gaze direction of the first user if the identity or the status of the first user meets a requirement (see Chu Paragraph [0135], the instructions may be executed in such a way that the processor 120 controls the communication circuit to receive information on the line of sight of the user of the electronic device 101 to at least part of the contents and transmits to the external electronic device the line-of-sight information as the status information of the electronic device 101);
controlling, by the processing circuit, the transmission circuit to transmit the first user information to the second video communication device (see Chu Paragraph [0135], the instructions may be executed in such a way that the processor 120 controls the communication circuit to receive information on the line of sight of the user of the electronic device 101 to at least part of the contents and transmits to the external electronic device the line-of-sight information as the status information of the electronic device 101, and Figure 10B, in which interface is transmitted a notification of information regarding other user in the communication, which states the counterpart is not paying attention to the screen, according to line-of-sight information);
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of an electronic device that selects an active field of view for a camera to output in a virtual call corresponding to the user’s gaze (as taught in Agrawal), with determining a user’s identity or status and processing and transmitting user information in a video call according to the user’s gaze (as taught in Chu), the motivation being to ensure unauthorized personnel do not have access to the communication, and additionally to avoid a lapse in productivity or communication by providing the user’s gaze information to the other participant(s) in the call (see Chu Paragraph [0161] and Figure 10B).
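For illustration of the combined limitations mapped above, the following minimal runnable sketch models the claimed exchange; all class and method names are hypothetical and are not drawn from Agrawal, Chu, or the instant application:

```python
class SecondDevice:
    """Hypothetical stand-in for the remote (second) video communication device."""
    def receive_user_info(self, info):
        self.gaze = info["eye_gaze_direction"]  # first user information received

    def capture_adjusted_frame(self):
        # Field of view adjusted toward the focus area for the reported gaze.
        return f"second_real_time_image(fov->{self.gaze})"

class FirstDevice:
    """Hypothetical stand-in for the local (first) video communication device."""
    def capture_frame(self):
        return "first_real_time_image"

    def estimate_gaze(self, frame):
        return "upper_left"  # eye gaze direction of the first user

    def display(self, frame):
        print("displaying", frame)

def conference_step(first, second):
    frame = first.capture_frame()                        # capture the first real-time image
    info = {"eye_gaze_direction": first.estimate_gaze(frame)}
    second.receive_user_info(info)                       # transmit the first user information
    adjusted = second.capture_adjusted_frame()           # second video signal, adjusted FOV
    first.display(adjusted)                              # display the adjusted image

conference_step(FirstDevice(), SecondDevice())
# displaying second_real_time_image(fov->upper_left)
```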
Regarding Claim 2, Agrawal in view of Chu teaches
The network video communication device of claim 1, wherein the image capture circuit comprises a plurality of photographic lenses for capturing a plurality of images, wherein the processing circuit selects at least one of the captured images to be processed and generates the first real-time image (see Agrawal Paragraph [0003], With communication devices that have multiple rear facing cameras, the rear facing cameras can have lenses that are optimized for various focal angles and distances. For example, one rear facing camera can have a wide angle lens, another rear facing camera can have a telephoto lens, and an additional rear facing camera can have a macro lens and Paragraph [0028], The electronic device further includes at least one processor that is communicatively coupled to each of the at least one front facing camera, each of the at least one rear facing camera, and to the memory. The at least one processor executes program code of the CSCM, which enables the electronic device to capture, via the at least one front facing camera, a first image stream containing a face of a first user. A first eye gaze direction of the first user is determined based on a first image retrieved from the first image stream. The first eye gaze direction corresponds to a first location where the first user is looking. In response to determining that the first user is looking away from the front surface of the electronic device and towards a direction within a field of view (FOV) of the at least one rear facing camera, the processor selects an active camera that corresponds to one of the at least one rear facing cameras with a FOV containing the first location to which the first user is looking).
Regarding Claim 8, Agrawal in view of Chu teaches
The network video communication device of claim 1, wherein the identity or the status of the first user meets the requirement when the identity of the first user is a host of the online video conference or the status of the first user is current speaker of the online video conference (see Agrawal Paragraph [0084], communication input 220 (FIG. 2) can be received contemporaneously with the detection of the eye-gaze direction of the user and used to generate context identifying data 254. In another example illustrated by the first camera selecting row of context table, if the user is looking away from the display 130 in a direction (or towards a location) that is behind the electronic device, while the user's speech is analyzed by NLP/CE engine 212 to provide context identifying data 254 including context identifier 510 as “self-image” and context type 520 is “speaker”, the selected camera would remain front facing camera 132A as indicated by an “X” in the box under the front facing camera 132A in the first row of context table 252. This prevents the electronic device from incorrectly switching to a rear camera when the user becomes distracted with something in the background that the user does not intend to share with the second user. As an example of such a distraction, a child or pet coming into a room while the user is on a conference call, would cause the user's eye gaze to shift away from the device display; However, the user does not want to show to the other participants on the video conference the child or dog that has entered into the FOV of the rear camera(s), in which first user’s speech is used to identify they are a speaker, and continue to use front camera displaying the user’s face).
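To illustrate the two gating decisions discussed for this claim, a brief sketch follows; the conditions and string values are hypothetical shorthand for the cited teachings:

```python
def meets_requirement(identity: str, status: str) -> bool:
    """Gate for generating first user information (including eye gaze direction):
    satisfied when the first user is the conference host or the current speaker."""
    return identity == "host" or status == "current_speaker"

def keep_front_camera(context_identifier: str, context_type: str) -> bool:
    """Mirrors the context-table row cited from Agrawal [0084]: when the speech
    context indicates a self-image/speaker context, the front facing camera
    remains active even though the gaze has drifted behind the device."""
    return context_identifier == "self-image" and context_type == "speaker"

# Example: a distracted host who is still speaking stays on the front camera.
print(meets_requirement("host", "listening"))        # True
print(keep_front_camera("self-image", "speaker"))    # True
```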
Regarding Claim 9, it is rejected under the same rationale as Claim 1.
Regarding Claim 10, Agrawal in view of Chu teaches
The network video communication device of claim 9, wherein the image capture circuit comprises a plurality of photographic lenses for capturing a plurality of images (see Agrawal Paragraph [0003], With communication devices that have multiple rear facing cameras, the rear facing cameras can have lenses that are optimized for various focal angles and distances. For example, one rear facing camera can have a wide angle lens, another rear facing camera can have a telephoto lens, and an additional rear facing camera can have a macro lens and Paragraph [0028], The electronic device further includes at least one processor that is communicatively coupled to each of the at least one front facing camera, each of the at least one rear facing camera, and to the memory. The at least one processor executes program code of the CSCM, which enables the electronic device to capture, via the at least one front facing camera, a first image stream containing a face of a first user. A first eye gaze direction of the first user is determined based on a first image retrieved from the first image stream. The first eye gaze direction corresponds to a first location where the first user is looking. In response to determining that the first user is looking away from the front surface of the electronic device and towards a direction within a field of view (FOV) of the at least one rear facing camera, the processor selects an active camera that corresponds to one of the at least one rear facing cameras with a FOV containing the first location to which the first user is looking), wherein the processing circuit selects at least one of the captured images according to the first user information to merge into the second real-time image (see Agrawal Paragraph [0090] Referring to FIG. 6C, electronic device 100 is further shown continuing video communication session 386 by user 610. In FIG. 6C, user 610 has looked away from front surface 176 and display 130 in a second eye gaze direction 260B towards dog 626. Dog 626 is located in second location 262B, which is offset to the rear of electronic device 100 at a distance 644 away from electronic device 100. Dog 626 is located facing rear surface 180 and is in a FOV 646 of at least one of rear facing cameras 133 A-C (FIG. 1C). A secondary display device 648 is communicatively coupled to electronic device 100. In one embodiment, electronic device 100 can transmit captured video data 270 to secondary display device 648 for presentation of images/video thereon, Paragraph [0091] CSCM 136 enables electronic device 100 to capture, via front facing camera 132A, an image stream 240 including second image 244 containing a face 614 of user 610 and determine second eye gaze direction 260B, by processor 102 processing second image 244 retrieved from image stream 240. Second eye gaze direction 260B corresponds to second location 262B where the user is looking. In one or more embodiments, CSCM 136 enables electronic device 100 to determine second location 262B at least partially based on second eye gaze direction 260B and on camera parameters 264, including distances to objects within a FOV, and Paragraph [0092], CSCM 136 further enables processor 102 of electronic device 100 to determine if the user is looking away from the front surface 176 and towards a location behind electronic device 100 (i.e., towards a location that can be captured within a FOV of at least one of the rear facing cameras) for more than a threshold amount of time. 
In response to processor 102 determining that the user's eye gaze is in a direction within the FOV of at least one of the rear cameras, and in part based on a context determined from processing, by NLP/CE engine, of detected speech of the user, processor 102 triggers ICD controller 134 (FIG. 1A) to select as an active camera, a corresponding one of at least one rear facing cameras 133A-133C with a FOV 646 towards second eye gaze direction 260B and containing second location 262B to which the user 610 is looking. In the example of FIG. 6C, rear facing main camera 133A can be selected as the active camera with a FOV 646 that includes dog 626 when the user says “my dog is so adorable” contemporaneously with fixing his/her gaze in direction of dog 626. CSCM 136 further enables electronic device 100 to activate rear facing main camera 133A and to capture images (e.g., image/video data 270) within the FOV 646 of the activated camera).
Regarding Claim 13, Agrawal in view of Chu teaches
The network video communication device of claim 9, wherein the first identity of the remote user is a host of the online video conference, and the first status of the remote user is current speaker in the online video conference (see Agrawal Paragraph [0084], communication input 220 (FIG. 2) can be received contemporaneously with the detection of the eye-gaze direction of the user and used to generate context identifying data 254. In another example illustrated by the first camera selecting row of context table, if the user is looking away from the display 130 in a direction (or towards a location) that is behind the electronic device, while the user's speech is analyzed by NLP/CE engine 212 to provide context identifying data 254 including context identifier 510 as “self-image” and context type 520 is “speaker”, the selected camera would remain front facing camera 132A as indicated by an “X” in the box under the front facing camera 132A in the first row of context table 252. This prevents the electronic device from incorrectly switching to a rear camera when the user becomes distracted with something in the background that the user does not intend to share with the second user. As an example of such a distraction, a child or pet coming into a room while the user is on a conference call, would cause the user's eye gaze to shift away from the device display; However, the user does not want to show to the other participants on the video conference the child or dog that has entered into the FOV of the rear camera(s), in which first user’s speech is used to identify they are a speaker, and continue to use front camera displaying the user’s face and Chu Paragraph [0157], the first electronic device 400 may communicate with the second electronic device 410 via the first communication channel 430 at operation 610. For example, the first user may start a voice call with the second user via the first communication channel 430 and Paragraph [0115], the server 420 may check the identifiers such as IDs of the first electronic device 400 and/or the second electronic device 410 and may perform authentication on the first electronic device 400 and/or second electronic device 410).
Regarding Claim 14, Agrawal in view of Chu teaches
The network video communication device of claim 9, further comprising: a display,
wherein the transmission circuit receives a video signal including a remote real-time image (see Agrawal Paragraph [0058], Communication module 138 enables electronic device 100 to communicate with wireless network 150 and with other devices, such as second electronic device 300, via one or more of audio, text, and video communications. Communication module 138 supports various communication sessions by electronic device 100, such as audio communication sessions, video communication sessions, text communication sessions, communication device application communication sessions, or a dual/combined audio/text/video communication session) and the processing circuit controls the display to display the remote real-time image in a video conference screen (see Agrawal Figure 6B, in which real-time video of first user is displayed on call interface (item 630)).
Regarding Claim 16, it is rejected under the same rationale as Claim 1. The method can be found in Agrawal (Abstract, method).
Regarding Claim 21, it is rejected under the same rationale as Claim 8. The method can be found in Agrawal (Abstract, method).
Regarding Claim 22, Agrawal in view of Chu teaches
The method of video conference image processing of claim 16, wherein the step of displaying the video conference screen including the second real-time image with the adjusted field of view comprises:
receiving a priority for the second real-time image with the adjusted field of view (see Agrawal Paragraph [0119] With specific reference to FIG. 9A, method 900 begins at the start block 902. At block 904, processor 102 detects that video data 270 is being captured by the current active camera (i.e., one of front facing cameras 132A-132B or rear facing cameras 133A-133C). Processor 102 triggers front facing camera 132A to capture an image stream 240 including second image 244 containing a face 614 of user 610 (block 906). Processor 102 receives image stream 240 including second image 244 (block 908). Processor 102 determines second eye gaze direction 260B and corresponding second location 262B based on second image 244 retrieved from the image stream 240 (block 910). Processor 102 stores second eye gaze direction 260B and second location 262B to system memory 120 (block 911). The second eye gaze direction 260B corresponds to second location 262B where the user is looking. The second location 262B is determined at least partially based on second eye gaze direction 260B, Paragraph [0120], Processor 102 retrieves eye gaze threshold time 218 and starts timer 216 (block 912). Timer 216 tracks the amount of time that a user's eye gaze has rested/remained on a specific area that is away from the display 130 at front surface 176 of electronic device 100. Processor 102 determines if user 610 is looking away from the display 130 at front surface 176 of electronic device 100 and towards a direction within a FOV of the rear facing cameras (decision block 914). In response to determining that user 610 is not looking away from the display 130, processor 102 captures video data 270 using the current active camera (block 928). Method 900 then ends at end block 930. In response to determining that user 610 is looking away from the front surface 176 and towards a direction behind the rear surface 180 of electronic device 100, processor 102 determines if the value of timer 216 is greater than eye gaze threshold time 218 (decision block 916). Eye gaze threshold time 218 is a minimum amount of time that a user is looking away from front surface 176 and display 130 to trigger a switch of active cameras to one of the rear facing cameras. In response to determining that the value of timer 216 is not greater than eye gaze threshold time 218, processor 102 returns to decision block 914 to continue determining if user 610 is looking away from the display 130, and Paragraph [0121], In response to determining that the value of timer 216 is greater than eye gaze threshold time 218, processor 102 selects to activate as an active camera, a corresponding one of the rear facing cameras 133A-133C based on second location 262B and identified characteristics of the second location 262B (block 918). The selected rear facing camera has a FOV 646 towards second eye gaze direction 260B and containing second location 262B to which the user 610 is looking. If electronic device 100 has only one rear facing camera (i.e., rear facing camera 133A), processor 102 selects the single rear facing camera. Processor 102 retrieves camera parameters 264 and camera settings 266 for the selected rear facing camera (block 920), in which a gaze that persists beyond the eye gaze threshold time is treated as the priority gaze and thus establishes the priority field of view to be displayed); and
arranging a screen area in the video conference screen corresponding to the received priority (see Agrawal Figure 11, capture video data of eye gaze location using active rear camera, and transmit video to the second electronic device).
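The threshold-timer behavior quoted above can be summarized with the following sketch; the threshold value and callable names are hypothetical:

```python
import time

EYE_GAZE_THRESHOLD_S = 2.0  # hypothetical eye gaze threshold time

def monitor_gaze(gaze_is_away, switch_to_rear_camera, poll_s=0.05):
    """Switch the active camera only after the gaze has rested away from the
    display for longer than the threshold; a returning gaze resets the timer."""
    started = None
    while True:
        if gaze_is_away():
            if started is None:
                started = time.monotonic()       # gaze left the display: start timer
            elif time.monotonic() - started > EYE_GAZE_THRESHOLD_S:
                switch_to_rear_camera()          # gaze held long enough: priority gaze
                return
        else:
            started = None                       # gaze returned to the display: reset
        time.sleep(poll_s)
```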
Claims 3 and 11 are rejected under 35 U.S.C. 103 as being unpatentable over Agrawal et al. (U.S. Pub. No. 2023/0276115, hereinafter “Agrawal”) in view of Chu et al. (U.S. Pub. No. 2017/0046111, hereinafter “Chu”) and Geerds (U.S. Pub. No. 2014/0267596).
Regarding Claim 3, Agrawal in view of Chu teaches all the limitations of claim 1, but does not expressly teach
The network video communication device of claim 1, wherein the image capture circuit comprises an upper photographic lens, a bottom photographic lens, a middle photographic lens, a left photographic lens and a right photographic lens, and the image capture circuit covers a field of view more than 130 degrees horizontally and 105 degrees vertically.
However, Geerds teaches
The network video communication device of claim 1, wherein the image capture circuit comprises an upper photographic lens, a bottom photographic lens, a middle photographic lens, a left photographic lens and a right photographic lens, and the image capture circuit covers a field of view more than 130 degrees horizontally and 105 degrees vertically (see Geerds Figure 1, image capture circuit comprising six lenses (items 110), in which there is a top lens, a bottom lens, and four side lenses (which could be considered middle, left, or right lenses, depending on orientation), and Paragraph [0032], geometry of the disclosed camera system makes it possible to produce a fully 360 by 180 degrees, spherical image or fully spherical video).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of an authenticated electronic device that selects an active field of view for a camera to output in a virtual call corresponding to the user’s gaze, and provides user information to other users in the call according to the user’s gaze (as taught in Agrawal in view of Chu), with a camera that has multiple lenses in various directions that captures a very large and wide field of view (as taught in Geerds), the motivation being to address the limited field of view provided by a single camera and to ensure areas of interest are captured (see Geerds Paragraph [0003]).
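As a check of the claimed coverage figures only, the following sketch merges per-lens angular spans into a combined field of view; the spans are invented for illustration and are not taken from Geerds:

```python
def merged_span_deg(spans):
    """Merge overlapping per-lens angular spans and return the total extent."""
    spans = sorted(spans)
    lo, hi = spans[0]
    for start, end in spans[1:]:
        if start <= hi:  # overlapping lenses form one continuous span
            hi = max(hi, end)
    return hi - lo

# Hypothetical left/middle/right horizontal spans and bottom/middle/upper vertical spans.
horizontal = [(-100, -20), (-40, 40), (20, 100)]
vertical = [(-70, 10), (-40, 40), (-10, 70)]

assert merged_span_deg(horizontal) > 130  # 200 degrees combined horizontally
assert merged_span_deg(vertical) > 105    # 140 degrees combined vertically
```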
Regarding Claim 11, it is rejected under the same rationale as Claim 3.
Claims 4, 12 and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Agrawal et al. (U.S. Pub. No. 2023/0276115, hereinafter “Agrawal”) in view of Chu et al. (U.S. Pub. No. 2017/0046111, hereinafter “Chu”) and Teshome et al. (U.S. Pub. No. 2016/0378183, hereinafter “Teshome”).
Regarding Claim 4, Agrawal in view of Chu teaches
The network video communication device of claim 1, wherein the step of processing, by the processing circuit, the first real-time image of the first user to generate the first user information comprises:
wherein the eye gaze direction is a direction which the first user looks at the second real-time images (see Agrawal Paragraph [0087], In one embodiment, video communication session 386 can be a video call. Front facing camera 132A has a field of view (FOV) 612 that captures an image stream 240 including first image 242 containing the face 614 of user 610. The face 614 of the user 610 includes a pair of eyes 616 that are looking in first gaze direction 260A. In FIG. 6A, the user is looking at display 130 embedded in front surface 176 of electronic device 100 and the eye gaze direction 260A is oriented toward display 130. First location 262A is at the same location as electronic device 100, and Figures 6A and 6B, item 616 is the eye gaze direction of the first user toward their display, which shows the video stream of a second user, item 634).
Agrawal in view of Chu does not expressly teach
performing edge detection on the first real-time image for detecting a user face position;
detecting eye characteristics for determining a user eye position corresponding to the user face position on the first real-time image; and
determining the eye gaze direction according to the user eye position;
However, Teshome teaches
performing edge detection on the first real-time image for detecting a user face position (see Teshome Paragraph [0074], the image processor 130 may detect a position of an eye using a method (for example, Haar-like features) of finding object features from the image. As a result, the image processor 130 may detect the iris and the pupil from the position of the eye using an edge detection method, and detect the point of gaze of the user based on relative positions of the iris and the pupil);
detecting eye characteristics for determining a user eye position corresponding to the user face position on the first real-time image (see Teshome Paragraph [0074], the image processor 130 may detect a position of an eye using a method (for example, Haar-like features) of finding object features from the image. As a result, the image processor 130 may detect the iris and the pupil from the position of the eye using an edge detection method, and detect the point of gaze of the user based on relative positions of the iris and the pupil); and
determining the eye gaze direction according to the user eye position (see Teshome Paragraph [0074], the image processor 130 may detect a position of an eye using a method (for example, Haar-like features) of finding object features from the image. As a result, the image processor 130 may detect the iris and the pupil from the position of the eye using an edge detection method, and detect the point of gaze of the user based on relative positions of the iris and the pupil);
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of an authenticated electronic device that selects an active field of view for a camera to output in a virtual call corresponding to the user’s gaze, and provides user information to other users in the call according to the user’s gaze (as taught in Agrawal in view of Chu), with determining a user’s gaze using edge detection to find the positions of the user’s eyes (as taught in Teshome), the motivation being to create a more realistic meeting experience in virtual communication by applying a process that determines a user’s gaze and implements a correction if needed (see Teshome Paragraphs [0005] – [0008]).
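A sketch of this kind of pipeline, using OpenCV's stock Haar cascades and Canny edge detection, follows; the cascade files and thresholds are illustrative choices, not parameters from Teshome:

```python
import cv2

face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
eye_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_eye.xml")

def eye_edge_maps(frame_bgr):
    """Locate the face, find eye regions within it (Haar-like features), and
    run edge detection in each eye region to expose iris/pupil boundaries,
    from whose relative positions a gaze direction could be estimated."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    results = []
    for (fx, fy, fw, fh) in face_cascade.detectMultiScale(gray, 1.3, 5):
        face_roi = gray[fy:fy + fh, fx:fx + fw]
        for (ex, ey, ew, eh) in eye_cascade.detectMultiScale(face_roi):
            eye_roi = face_roi[ey:ey + eh, ex:ex + ew]
            edges = cv2.Canny(eye_roi, 50, 150)  # iris/pupil edges
            results.append(((fx + ex, fy + ey, ew, eh), edges))
    return results
```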
Regarding Claim 12, Agrawal in view of Chu and Teshome teaches
The network video communication device of claim 9, wherein the network video communication device further performs the following step:
performing, by the processing circuit, edge detection on the second real-time image with the adjusted field of view (see Teshome Paragraph [0074], the image processor 130 may detect a position of an eye using a method (for example, Haar-like features) of finding object features from the image. As a result, the image processor 130 may detect the iris and the pupil from the position of the eye using an edge detection method, and detect the point of gaze of the user based on relative positions of the iris and the pupil) and determining an object-of-interest in the second real-time image with the adjusted field of view (see Agrawal Figure 10, in which process 1000 can be repeated to track the eye gaze direction of the user, Paragraph [0066], Camera settings 266 are values and characteristics that can change during the operation of cameras 132A-132B and 133A-133C to capture images by the cameras. In one embodiment, camera settings 266 can be determined by either processor 102 or by ICD controller 134. Camera settings 266 can include various settings such as aperture, shutter speed, iso level, white balance, zoom level, directional settings (i.e., region of interest (ROI)), distance settings, focus and others. Camera settings 266 can include optimal camera settings 268 that are selected to optimize the quality of the images captured by the cameras. Optimal camera settings 268 can include zoom levels, focus distance, and directional settings that allow images within a camera FOV and that are desired to be captured to be in focus, centered, and correctly sized. Optimal camera settings 268 can further include digital crop levels, focal distance of the focus module and directional audio zoom. The directional audio zoom enables microphone 108 to be tuned to receive audio primarily in the direction of a cropped FOV in a desired ROI, and Paragraph [0067], Image characteristics 272 can further include computer vision (CV), machine learning (ML) and artificial intelligence (AI) based techniques for determining an object of interest within the eye gaze area ROI); and
controlling, by the processing circuit, the image capture circuit to zoom-in and capture an enlarged image of the object-of-interest as the second real-time image with the adjusted field of view (see Agrawal Paragraph [0066], Camera settings 266 are values and characteristics that can change during the operation of cameras 132A-132B and 133A-133C to capture images by the cameras. In one embodiment, camera settings 266 can be determined by either processor 102 or by ICD controller 134. Camera settings 266 can include various settings such as aperture, shutter speed, iso level, white balance, zoom level, directional settings (i.e., region of interest (ROI)), distance settings, focus and others. Camera settings 266 can include optimal camera settings 268 that are selected to optimize the quality of the images captured by the cameras. Optimal camera settings 268 can include zoom levels, focus distance, and directional settings that allow images within a camera FOV and that are desired to be captured to be in focus, centered, and correctly sized. Optimal camera settings 268 can further include digital crop levels, focal distance of the focus module and directional audio zoom. The directional audio zoom enables microphone 108 to be tuned to receive audio primarily in the direction of a cropped FOV in a desired ROI).
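A digital-zoom sketch of this limitation follows, a crop-and-upscale stand-in for the optimal camera settings quoted above; the ROI format and output size are assumptions:

```python
import cv2
import numpy as np

def enlarged_object_image(frame: np.ndarray, roi, out_size=(1280, 720)) -> np.ndarray:
    """Crop the detected object-of-interest ROI (x, y, w, h) and upscale it,
    approximating a zoom-in toward the object as the adjusted field of view."""
    x, y, w, h = roi
    crop = frame[y:y + h, x:x + w]
    return cv2.resize(crop, out_size, interpolation=cv2.INTER_LINEAR)
```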
Regarding Claim 17, it is rejected under the same rationale as Claim 4. The method can be found in Agrawal (Abstract, method).
Claims 5 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Agrawal et al. (U.S. Pub. No. 2023/0276115, hereinafter “Agrawal”) in view of Chu et al. (U.S. Pub. No. 2017/0046111, hereinafter “Chu”), Teshome et al. (U.S. Pub. No. 2016/0378183, hereinafter “Teshome”) and Chakravarthula et al. (U.S. Pub. No. 2013/0057553, hereinafter “Chakravarthula”).
Regarding Claim 5, Agrawal in view of Chu and Teshome teaches all the limitations of claim 4, but does not expressly teach
The network video communication device of claim 4, wherein the step of processing, by the processing circuit, the first real-time image of the first user to generate the first user information further comprises:
calculating a user distance based on user face size information corresponding to the user face position on the first real-time image and focal length information of the image capture circuit.
However, Chakravarthula teaches
The network video communication device of claim 4, wherein the step of processing, by the processing circuit, the first real-time image of the first user to generate the first user information further comprises:
calculating a user distance based on user face size information corresponding to the user face position on the first real-time image and focal length information of the image capture circuit (see Chakravarthula Paragraph [0048], the focal length of the camera can be used to determine the distance of the user from the display, or alternatively the focal length can be combined with detected features such as the size of the face or the relative size of facial features on the user to determine the distance of the user from the display).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of an authenticated electronic device that selects an active field of view for a camera to output in a virtual call corresponding to the determined user’s gaze, and provides user information to other users in the call according to the user’s gaze (as taught in Agrawal in view of Chu and Teshome), with calculating a user’s distance from the device based on face size and focal length information (as taught in Chakravarthula), the motivation being to adapt a display to a user by adjusting item size on the display according to the distance between the user and the display (see Chakravarthula Paragraph [0048]).
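The cited distance computation reduces to the pinhole-camera relation distance = focal_length × real_width / imaged_width; a worked sketch with assumed calibration values follows:

```python
REAL_FACE_WIDTH_CM = 14.0      # assumed average adult face width
FOCAL_LENGTH_PX = 1000.0       # hypothetical calibrated focal length in pixels

def user_distance_cm(face_width_px: float) -> float:
    """Estimate the user's distance from the display via similar triangles."""
    return FOCAL_LENGTH_PX * REAL_FACE_WIDTH_CM / face_width_px

print(user_distance_cm(200.0))  # 1000 * 14 / 200 = 70.0 cm
```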
Regarding Claim 18, it is rejected under the same rationale as Claim 5. The method can be found in Agrawal (Abstract, method).
Claims 6 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Agrawal et al. (U.S. Pub. No. 2023/0276115, hereinafter “Agrawal”) in view of Chu et al. (U.S. Pub. No. 2017/0046111, hereinafter “Chu”), Teshome et al. (U.S. Pub. No. 2016/0378183, hereinafter “Teshome”) and Gavino et al. (U.S. Pub. No. 2019/0042834, hereinafter “Gavino”).
Regarding Claim 6, Agrawal in view of Chu and Teshome teaches all the limitations of claim 4, but does not expressly teach
The network video communication device of claim 4, wherein the step of determining the eye gaze direction according to the user eye position comprises:
determining a horizontal face midline and a vertical face midline of a user face at the user face position; and
comparing the user's eye positions with the horizontal face midline and the vertical face midline of the user face.
However, Gavino teaches
The network video communication device of claim 4, wherein the step of determining the eye gaze direction according to the user eye position comprises:
determining a horizontal face midline and a vertical face midline of a user face at the user face position (see Gavino Paragraph [0069], The example position tracker 338 identifies a horizontal line representing one-third of the vertical measure from the top bounding line of the face bounding rectangle. In examples disclosed herein, the example position tracker 338 identifies the horizontal line by finding a line that is one-third of the distance from a top edge of the face bounding rectangle and a bottom edge of the face bounding rectangle. However, any other ratio for finding a horizontal line may additionally or alternatively be used. The example position tracker 338 calculates an intersection of the vertical centerline and the horizontal line to determine the estimated eye position); and
comparing the user's eye positions with the horizontal face midline and the vertical face midline of the user face (see Gavino Paragraph [0069], The example position tracker 338 identifies a horizontal line representing one-third of the vertical measure from the top bounding line of the face bounding rectangle. In examples disclosed herein, the example position tracker 338 identifies the horizontal line by finding a line that is one-third of the distance from a top edge of the face bounding rectangle and a bottom edge of the face bounding rectangle. However, any other ratio for finding a horizontal line may additionally or alternatively be used. The example position tracker 338 calculates an intersection of the vertical centerline and the horizontal line to determine the estimated eye position).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of an authenticated electronic device that selects an active field of view for a camera to output in a virtual call corresponding to the determined user’s gaze, and provides user information to other users in the call according to the user’s gaze (as taught in Agrawal in view of Chu and Teshome), with mapping horizontal and vertical lines on a user’s face and comparing the positions of the eyes to those lines to determine the user’s eye positions (as taught in Gavino), the motivation being to identify the position of a person’s eyes in a simple manner in order to generate corresponding eye-position data (see Gavino Paragraph [0069]).
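The cited estimate reduces to two lines: the eyes lie at the intersection of the face's vertical centerline and a horizontal line one-third of the way down the face bounding rectangle. A minimal sketch, assuming an (x, y, w, h) bounding box:

```python
def estimate_eye_position(face_box):
    """face_box is (x, y, w, h); returns the estimated (eye_x, eye_y) at the
    intersection of the vertical centerline and the one-third horizontal line."""
    x, y, w, h = face_box
    return (x + w / 2.0, y + h / 3.0)

print(estimate_eye_position((100, 50, 90, 120)))  # (145.0, 90.0)
```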
Regarding Claim 19, it is rejected under the same rationale as Claim 6. The method can be found in Agrawal (Abstract, method).
Claims 7 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Agrawal et al. (U.S. Pub. No. 2023/0276115, hereinafter “Agrawal”) in view of Chu et al. (U.S. Pub. No. 2017/0046111, hereinafter “Chu”) and Norris et al. (U.S. Pub. No. 2015/0373477, hereinafter “Norris”).
Regarding Claim 7, Agrawal in view of Chu teaches all the limitations of claim 1, but does not expressly teach
The network video communication device of claim 1, further comprising:
a microphone, wherein the microphone receives a first real-time audio local to the network video communication device, and the processing circuit processes the first real-time audio to determine a direction and the status of the first user.
However, Norris teaches
The network video communication device of claim 1, further comprising:
a microphone, wherein the microphone receives a first real-time audio local to the network video communication device, and the processing circuit processes the first real-time audio to determine a direction and the status of the first user (see Norris Paragraph [0112], an electronic device can intelligently assign locations for one or more sound localization points or virtual microphone points. Selection of the location can be based on, for example, available space near the listener, location of another person, previous assignments of sound localization or virtual microphone points, type or origin of the sound, environment in which the listener is located, objects near the person, a social status or personal characteristics of a person, a person with whom the listener is communicating, time of arrival or reservation or other time-related property, etc.).
It would have been obvious to one of ordinary skill in the art before the effective filing date of
the claimed invention to combine the teaching of an authenticated electronic device that selects an active field of view for a camera to output in a virtual call corresponding to the user’s gaze, and provides user information to other users in the call according to the user’s gaze (as taught in Agrawal in view of Chu), with using a user’s audio to determine the direction and status of the user (as taught in Norris), the motivation being to address virtual meetings that feel unnatural and impersonal and to use audio data to collect information about participants (see Norris Paragraphs [0001] and [0112]).
Regarding Claim 20, it is rejected under a rationale similar to that applied to Claim 7. The method can be found in Agrawal (Abstract, method).
Claim 15 is rejected under 35 U.S.C. 103 as being unpatentable over Agrawal et al. (U.S. Pub. No. 2023/0276115, hereinafter “Agrawal”) in view of Chu et al. (U.S. Pub. No. 2017/0046111, hereinafter “Chu”) and Cranfill et al. (U.S. Pub. No. 2011/0249074, hereinafter “Cranfill”).
Regarding Claim 15, Agrawal in view of Chu teaches all the limitations of claim 9, but does not expressly teach
The network video communication device of claim 9, wherein the network video communication device further performs the following steps:
controlling, by the processing circuit, the image capture circuit to capture the first real-time image with a previous field of view and the second real-time image with the adjusted field of view at the same time; and
transmitting, by the transmission circuit, the video signal including the first real-time image with the previous field of view and the second real-time image with the adjusted field of view to the at least one of the other participants of the online video conference.
However, Cranfill teaches
The network video communication device of claim 9, wherein the network video communication device further performs the following steps:
controlling, by the processing circuit, the image capture circuit to capture the first real-time image with a previous field of view and the second real-time image with the adjusted field of view at the same time (see Cranfill Paragraph [0673], selection of the "Select L1" button 7420 would cause the UI 7475 to display only the video captured by the local device's back camera (being presented in the foreground inset display 7410). Selection of the "Select L2" button 7425 would cause the UI 7475 to display only the video captured by the local device's front camera (being presented in the foreground inset display 7405). Selection of the "Select Both" button 7430 would cause the UI 7475 to continue displaying both videos captured by both cameras on the local device, and selecting the "Cancel" button 7485 would cancel the operation); and
transmitting, by the transmission circuit, the video signal including the first real-time image with the previous field of view and the second real-time image with the adjusted field of view to the at least one of the other participants of the online video conference (see Cranfill Paragraph [0687] and Figure 75, the set of selectable UI items includes: a "Transmit L1" item 7528 (e.g. button 7528); a "Transmit L2" item 7530 (e.g. button 7530); a "Transmit Both" item 7532 (e.g. button 7532); and a "Cancel" item 7534 (e.g. button 7534). In this example, selection of the "Transmit L1" button 7528 would cause the UI 7500 to transmit only the video captured by the device's back camera to the remote device during the video conference. Selection of the "Transmit L2" button 7530 would cause the UI 7500 to transmit only the video captured by the device's front camera to the remote device during the video conference. Selection of the "Transmit Both" button 7532 would cause the UI 7500 to transmit both videos captured by the device's front and back camera to the remote user for the video conference, and selecting the "Cancel" button 7534 would cancel the operation).
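As a hypothetical sketch of the selection logic quoted from Cranfill Paragraph [0687] (the function and stream names below are illustrative and do not appear in Cranfill), the three transmit options map directly onto the two simultaneously captured camera streams:

    def streams_to_transmit(choice, back_camera_stream, front_camera_stream):
        # Per Cranfill [0687]: "Transmit L1" sends only the back-camera
        # video, "Transmit L2" only the front-camera video, and
        # "Transmit Both" sends both streams to the remote device.
        options = {
            "Transmit L1": [back_camera_stream],
            "Transmit L2": [front_camera_stream],
            "Transmit Both": [back_camera_stream, front_camera_stream],
        }
        # "Cancel" (or any unrecognized choice) aborts the operation and
        # transmits nothing.
        return options.get(choice, [])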
It would have been obvious to one of ordinary skill in the art before the effective filing date of
the claimed invention to combine the teaching of an authenticated electronic device that selects an active field of view for a camera to output in a virtual call corresponding to the user’s gaze, and provides user information to other users in the call according to the user’s gaze (as taught in Agrawal in view of Chu), with simultaneously capturing and transmitting two fields of view from a device’s cameras in a communication session (as taught in Cranfill), the motivation being to provide a more inclusive and informative meeting environment that allows a user to display multiple views or items at once (see Cranfill Paragraphs [0687] and [0001]-[0002], and Figure 75).
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Refer to PTO-892, Notice of References Cited for a listing of analogous art.
THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to CARISSA A JONES whose telephone number is (703) 756-1677. The examiner can normally be reached via telework, Monday through Friday, 6:30 AM - 4:00 PM CT.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Duc Nguyen, can be reached at 571-272-7503. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/CARISSA A JONES/Examiner, Art Unit 2691
/DUC NGUYEN/Supervisory Patent Examiner, Art Unit 2691