Prosecution Insights
Last updated: April 19, 2026
Application No. 18/724,104

SYSTEMS AND METHODS FOR VIRTUAL REALITY IMMERSIVE CALLING

Non-Final OA (§102, §103)

Filed: Jun 25, 2024
Examiner: TIEU, BINH KIEN
Art Unit: 2694
Tech Center: 2600 — Communications
Assignee: Canon Kabushiki Kaisha
OA Round: 1 (Non-Final)

Grant Probability: 87% (Favorable)
OA Rounds: 1-2
To Grant: 2y 5m
With Interview: 97%

Examiner Intelligence

Career Allow Rate: 87% — above average (809 granted / 931 resolved; +24.9% vs TC avg)
Interview Lift: +9.8% (moderate, ~+10%), measured across resolved cases with an interview
Typical Timeline: 2y 5m average prosecution; 25 applications currently pending
Career History: 956 total applications across all art units
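For reference, the headline figures reproduce directly from the raw counts on this card. Below is a minimal sketch in Python; the 62.0% Tech Center average is back-derived from the displayed +24.9% delta and is an assumption for illustration, not an official statistic.

```python
# Reproduce the examiner card's headline numbers from the raw counts above.
# The Tech Center (TC) average is back-derived from the displayed +24.9%
# delta; it is an assumption, not an official figure.

granted = 809          # career grants
resolved = 931         # career resolved cases

allow_rate = granted / resolved
print(f"Career allow rate: {allow_rate:.1%}")                # -> 86.9%, displayed as 87%

tc_average = 0.620     # assumed: 86.9% minus the +24.9% delta
print(f"Delta vs TC avg:   {allow_rate - tc_average:+.1%}")  # -> +24.9%
```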

Statute-Specific Performance

§101: 6.1% (-33.9% vs TC avg)
§103: 43.9% (+3.9% vs TC avg)
§102: 26.5% (-13.5% vs TC avg)
§112: 4.1% (-35.9% vs TC avg)

Black line = Tech Center average estimate • Based on career data from 931 resolved cases
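One detail worth noting: subtracting each delta from its rate recovers the same baseline in every row, which suggests the "Tech Center average estimate" (the black line in the original chart) sits at a flat 40.0% for all four statutes. A quick check in Python, using only the figures from the table above:

```python
# Sanity-check the statute table: rate minus delta should recover the
# Tech Center baseline that each delta was computed against.
rows = {
    "§101": (6.1, -33.9),
    "§103": (43.9, +3.9),
    "§102": (26.5, -13.5),
    "§112": (4.1, -35.9),
}
for statute, (rate, delta) in rows.items():
    print(f"{statute}: implied TC average = {rate - delta:.1f}%")
# Every row prints 40.0%, i.e., a single flat baseline.
```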

Office Action

DETAILED ACTION

Claim Rejections - 35 USC § 102

The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:

A person shall be entitled to a patent unless – (a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1-3, 6, 8-10, 13, 15-18 and 21 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Lee et al. (US 11,620,780).

Regarding claim 1, Lee et al. (hereinafter "Lee") teaches a system (i.e., a system comprising a computing system 200, a plurality of devices 201 associated with a user and a plurality of devices 203 associated with a second user; col.4, lines 7-23 and col.5, lines 46-56) for immersive virtual reality communication (i.e., virtual reality and mixed reality display devices being used to present immersive visual experiences in a video-based communication, e.g., the mixed reality teleconference scenario between the user 100 and the second user 104, as shown in figure 1, in an immersive manner; col.2, lines 9-26), the system comprising: a first capture device configured to capture a stream of images of a first user (i.e., a first image device 110 and/or a first head-mounted display device 108 images a face of the first user 100; col.3, lines 4-16 and col.4, lines 18-23); a first network configured to transmit the captured stream of images of the first user (i.e., the arrow lines connecting the plurality of devices 201 to a communication network 214, as shown in figure 2, performed as a first network, are network connections used to transmit the image data to the remote computing system 200; col.3, lines 17-21; col.5, lines 57-60 and col.6, lines 11-16); a second network configured to receive data based at least in part on the captured stream of images of the first user (i.e., the arrow lines connecting the plurality of devices 203 to the communication network 214, as shown in figure 2, performed as a second network, are network connections used to deliver the image data, including the first avatar 112, to one of the plurality of devices 203, such as a second head-mounted display device 118 worn by the second user; col.3, lines 30-32; col.7, lines 47-58 and col.8, lines 57-66); and a first virtual reality device used by a second user, wherein the first virtual reality device is configured to render a virtual environment and to produce a rendition of the first user based at least in part on the data based at least in part on the stream of images of the first user produced by the first capture device (i.e., viewing the avatar of the first user in the mixed reality teleconference; col.3, lines 30-38; col.6, lines 20-32 and col.9, lines 10-37).
Regarding claim 2, Lee further teaches the system further comprising: a second capture device configured to capture a stream of images of the second user (i.e., an outward-facing image sensor of the second head-mounted display device 118 or a second imaging device 120, as shown in figure 1, images or captures the second user's face; col.3, lines 47-52); and a second virtual reality device used by the first user, wherein the second virtual reality device is configured to render a virtual environment and to produce a rendition of the second user based at least in part on data based at least in part on the captured stream of images of the second user produced by the second capture device (i.e., the second avatar 112 of the second user is presented on the first head-mounted display device 108; col.3, lines 55-67).

Regarding claim 3, Lee further teaches the limitations of the claim, such as the immersive visual experience in the virtual reality and mixed reality display (as the virtual environment) presented to the first user and second user, as shown in figure 1 (col.3, lines 21-38 and lines 47-54). Lee further teaches a viewpoint, such as the first user moving out of a field of view of an image device while the second user may stay within his or her field of view of the second image device (col.11, line 45 through col.12, line 12). Also, Lee further teaches the first user may be positioned either close to or far from the image device 704 so that the viewpoint of the first user may be changed and different, as shown in figures 7A and 7B (col.10, line 61 through col.11, line 44).

Regarding claim 6, Lee further teaches the first user and second user use different types of imaging/display devices on the first and second networks (network connections) in the mixed reality teleconference. Depending on the capabilities of the display device, the remote computing device 200 comprises the machine learning model (performed as a graphical processing unit) to generate an audio/video stream 232 including the avatar 224 (col.7, lines 15-28). Lee further teaches the remote computing system 200 may receive and transmit the audio/video stream 232 including the avatar 224 in one of several forms, e.g., two-dimensional display, three-dimensional virtual reality display, etc., from and to each of the head-mounted display devices, smartphone, tablet, etc. via the first and second networks (col.9, lines 10-37).
Regarding claim 8, Lee teaches a method for immersive virtual reality communication, the method comprising: capturing a stream of images of a first user (i.e., a first image device 110 and/or a first head-mounted display device 108 images a face of the first user 100; col.3, lines 4-16 and col.4, lines 18-23); transmitting the captured stream of images of the first user (i.e., the arrow lines connecting the plurality of devices 201 to a communication network 214, as shown in figure 2, performed as a first network, are network connections used to transmit the image data to the remote computing system 200; col.3, lines 17-21; col.5, lines 57-60 and col.6, lines 11-16); receiving data based at least in part on the captured stream of images of the first user (i.e., the arrow lines connecting the plurality of devices 203 to the communication network 214, as shown in figure 2, performed as a second network, are network connections used to deliver the image data, including the first avatar 112, to one of the plurality of devices 203, such as a second head-mounted display device 118 worn by the second user; col.3, lines 30-32; col.7, lines 47-58 and col.8, lines 57-66); and rendering a virtual environment and producing a rendition of the first user based at least in part on the data based at least in part on the stream of images of the first user produced by the first capture device (i.e., viewing the avatar of the first user in the mixed reality teleconference; col.3, lines 30-38; col.6, lines 20-32 and col.9, lines 10-37).

Regarding claim 9, Lee further teaches the method further comprising: capturing a stream of images of the second user (i.e., an outward-facing image sensor of the second head-mounted display device 118 or a second imaging device 120, as shown in figure 1, images or captures the second user's face; col.3, lines 47-52); and rendering a virtual environment and producing a rendition of the second user based at least in part on data based at least in part on the captured stream of images of the second user produced by the second capture device (i.e., the second avatar 112 of the second user is presented on the first head-mounted display device 108; col.3, lines 55-67).

Regarding claim 10, Lee further teaches the limitations of the claim, such as the immersive visual experience in the virtual reality and mixed reality display (as the virtual environment) presented to the first user and second user, as shown in figure 1 (col.3, lines 21-38 and lines 47-54). Lee further teaches a viewpoint, such as the first user moving out of a field of view of an image device while the second user may stay within his or her field of view of the second image device (col.11, line 45 through col.12, line 12). Also, Lee further teaches the first user may be positioned either close to or far from the image device 704 so that the viewpoint of the first user may be changed and different, as shown in figures 7A and 7B (col.10, line 61 through col.11, line 44).

Regarding claim 13, Lee further teaches the first user and second user use different types of imaging/display devices on the first and second networks (network connections) in the mixed reality teleconference. Depending on the capabilities of the display device, the remote computing device 200 comprises the machine learning model (performed as a graphical processing unit) to generate an audio/video stream 232 including the avatar 224 (col.7, lines 15-28).
Lee further teaches the remote computing system 200 may receive and transmit the audio/video stream 232 including the avatar 224 in one of several forms, e.g., two-dimensional display, three-dimensional virtual reality display, etc., from and to each of the head-mounted display devices, smartphone, tablet, etc. via the first and second networks (col.9, lines 10-37).

Regarding claim 15, Lee teaches a virtual reality apparatus for immersive communication, the apparatus comprising: a storage unit further comprising: a capture module configured to capture a stream of images of a first user (i.e., a first image device 110 and/or a first head-mounted display device 108 images or captures a face of the first user 100; col.3, lines 4-16 and col.4, lines 18-23); a communication module configured to transmit the captured stream of images of the first user to a network (i.e., the arrow lines connecting the plurality of devices 201 to a communication network 214, as shown in figure 2, performed as a communication module, are used to transmit the image data to the remote computing system 200; col.3, lines 17-21; col.5, lines 57-60 and col.6, lines 11-16); a rendering module configured to render a virtual environment (i.e., the avatar 224 of the first user in the audio/video stream 232 is visually presented on a head-mounted display device of the second user to view a mixed reality teleconference as a virtual environment; col.7, lines 47-62; col.9, lines 11-19 and col.10, lines 24-36); a rendition module configured to produce a rendition of the first user based at least in part on the data based at least in part on the stream of images of the first user produced by the capture module (i.e., viewing the avatar of the first user in the mixed reality teleconference; col.3, lines 30-38; col.6, lines 20-32 and col.9, lines 10-37); and a position module configured to direct the user based on position optimization based on at least one of a user pose, a user position, a user scale, or a virtual reality device boundary (i.e., the head-mounted display device or hand-held display device determines or directs motion of an avatar in a video stream, such as a position of a table and the first user based on position optimization, as shown in figures 6A and 6B (col.10, lines 37-60); also, as shown in figures 7A and 7B, the machine-learning model 216, performed as a position module, determines or directs a user moving beyond a threshold distance from an imaging device during a mixed reality teleconference; col.10, line 61 through col.11, line 44).
Regarding claim 16, Lee teaches a computer-readable storage device having computer-executable instructions stored therein, said instructions causing a computing device to execute a method for immersive virtual reality communication, comprising: capturing a stream of images of a first user (i.e., a first image device 110 and/or a first head-mounted display device 108 images a face of the first user 100; col.3, lines 4-16 and col.4, lines 18-23); transmitting the captured stream of images of the first user (i.e., the arrow lines connecting the plurality of devices 201 to a communication network 214, as shown in figure 2, performed as a first network, are network connections used to transmit the image data to the remote computing system 200; col.3, lines 17-21; col.5, lines 57-60 and col.6, lines 11-16); receiving data based at least in part on the captured stream of images of the first user (i.e., the arrow lines connecting the plurality of devices 203 to the communication network 214, as shown in figure 2, performed as a second network, are network connections used to deliver the image data, including the first avatar 112, to one of the plurality of devices 203, such as a second head-mounted display device 118 worn by the second user; col.3, lines 30-32; col.7, lines 47-58 and col.8, lines 57-66); and rendering a virtual environment and producing a rendition of the first user based at least in part on the data based at least in part on the stream of images of the first user produced by the first capture device (i.e., viewing the avatar of the first user in the mixed reality teleconference; col.3, lines 30-38; col.6, lines 20-32 and col.9, lines 10-37).

Regarding claim 17, Lee teaches the limitations further comprising: capturing a stream of images of the second user (i.e., an outward-facing image sensor of the second head-mounted display device 118 or a second imaging device 120, as shown in figure 1, images or captures the second user's face; col.3, lines 47-52); and rendering a virtual environment and producing a rendition of the second user based at least in part on data based at least in part on the captured stream of images of the second user produced by the second capture device (i.e., the second avatar 112 of the second user is presented on the first head-mounted display device 108; col.3, lines 55-67).

Regarding claim 18, Lee further teaches the limitations of the claim, such as the immersive visual experience in the virtual reality and mixed reality display (as the virtual environment) presented to the first user and second user, as shown in figure 1 (col.3, lines 21-38 and lines 47-54). Lee further teaches a viewpoint, such as the first user moving out of a field of view of an image device while the second user may stay within his or her field of view of the second image device (col.11, line 45 through col.12, line 12). Also, Lee further teaches the first user may be positioned either close to or far from the image device 704 so that the viewpoint of the first user may be changed and different, as shown in figures 7A and 7B (col.10, line 61 through col.11, line 44).

Regarding claim 21, Lee further teaches the first user and second user use different types of imaging/display devices on the first and second networks (network connections) in the mixed reality teleconference.
Depending on the capabilities of the display device, the remote computing device 200 comprises the machine learning model (performed as a graphical processing unit) to generate an audio/video stream 232 including the avatar 224 (col.7, lines 15-28). Lee further teaches the remote computing system 200 may receive and transmit the audio/video stream 232 including the avatar 224 in one of several forms, e.g., two-dimensional display, three-dimensional virtual reality display, etc., from and to each of the head-mounted display devices, smartphone, tablet, etc. via the first and second networks (col.9, lines 10-37).

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

This application currently names joint inventors. In considering patentability of the claims, the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

Claims 4-5, 11-12 and 19-20 are rejected under 35 U.S.C. 103 as being unpatentable over Lee et al. (US 11,620,780) in view of Katz et al. (US 2015/0193949).

Regarding claims 4-5, 11-12 and 19-20, Lee teaches all subject matter as claimed above, except for the features of directing the first user and the second user, prior to the first user and the second user renditions being generated via a user interface in the respective second virtual reality device and first virtual reality device, to move and turn to optimize a position of the first user and a position of the second user relative to the first capture device and the second capture device respectively based on a desired rendering environment. However, Katz et al. (hereinafter "Katz") teaches such features, including a tracking module that prompts a user to look up, down, left, right, or in other designated directions to calibrate the system environment to maintain accurate tracking of the VR headset (paragraph [0034]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the features of directing the first user and the second user, prior to the first user and the second user renditions being generated via a user interface in the respective second virtual reality device and first virtual reality device, to move and turn to optimize a position of the first user and a position of the second user relative to the first capture device and the second capture device respectively based on a desired rendering environment, as taught by Katz, into the system of Lee in order to calibrate the system environment and maintain an accurate display on the users' devices.

Claims 7, 14 and 22 are rejected under 35 U.S.C. 103 as being unpatentable over Lee et al. (US 11,620,780) in view of Korman (US 2023/0231982) or Khalid et al. (US 2019/0156565).

Regarding claim 7, Lee further teaches an outward-facing image sensor comprising one or more cameras that images the first physical scene 102, and further teaches the first imaging device 110 imaging or capturing the face of the first user 100 (col.3, lines 4-16). Lee does not clearly teach that the outward-facing image sensor of the first head-mounted display device 108 or the first imaging device 110 is stereoscopic. However, Korman teaches three-dimensional (3D) VR media comprising video captured using a stereoscopic camera (paras. [0023]-[0024]), and Khalid et al. (hereinafter "Khalid") teaches stereoscopic versions of the field of view that may be captured by one or more stereoscopic cameras (para. [0070]). Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the feature of the first capture device being a stereoscopic camera, as taught by either Korman or Khalid, into the system of Lee in order to present the user with 3D VR media or video on the user's head-mounted display device.

Regarding claims 14 and 22, Korman also teaches the limitations of the claims in paragraphs [0023]-[0024], and Khalid also teaches the limitations of the claims in paragraph [0070].

Any inquiry concerning this communication or earlier communications from the examiner should be directed to BINH TIEU, whose telephone number is (571) 272-7510. The examiner can normally be reached 9-5. The examiner's fax number is (571) 273-7510 and e-mail address is BINH.TIEU@USPTO.GOV.

Examiner interviews are available via telephone or video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, FAN S. TSANG, can be reached at (571) 272-7547.

Any response to this action should be mailed or hand-carried to: Commissioner of Patents and Trademarks, 401 Dulany Street, Alexandria, VA 22314, or faxed to (571) 273-8300.

Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov.
If you have any questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).

/Binh Kien Tieu/
Primary Examiner, Art Unit 2694
Date: January 2026

Prosecution Timeline

Jun 25, 2024: Application Filed
Jan 28, 2026: Non-Final Rejection — §102, §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12603111: AUDIO GUESTBOOK SYSTEMS AND METHODS (2y 5m to grant; granted Apr 14, 2026)
Patent 12598223: Dynamic Teleconference Content Item Distribution to Multiple Devices Associated with a User (2y 5m to grant; granted Apr 07, 2026)
Patent 12592994: REAL-TIME USER SCREENING OF MESSAGES WITHIN A COMMUNICATION PLATFORM (2y 5m to grant; granted Mar 31, 2026)
Patent 12592740: WIRELESS COMMUNICATION DEVICE AND WIRELESS COMMUNICATION METHOD (2y 5m to grant; granted Mar 31, 2026)
Patent 12573198: COMMUNICATION SYSTEM, OUTPUT DEVICE, COMMUNICATION METHOD, OUTPUT METHOD, AND OUTPUT PROGRAM (2y 5m to grant; granted Mar 10, 2026)

Study what changed in these cases to get past this examiner. Based on the 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: 87%
With Interview: 97% (+9.8%)
Median Time to Grant: 2y 5m
PTA Risk: Low

Based on 931 resolved cases by this examiner. Grant probability is derived from the career allow rate.
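The "with interview" projection is consistent with simply adding the examiner's +9.8% interview lift to the baseline career allow rate. A minimal sketch of that arithmetic in Python; the additive model is an assumption inferred from the displayed numbers, not a documented formula:

```python
# Reproduce the projection card under a simple additive model (assumption:
# the tool adds the interview lift to the baseline allow rate; this is
# inferred from the displayed figures, not a documented formula).

baseline = 809 / 931        # career allow rate -> 86.9%, shown as 87%
interview_lift = 0.098      # +9.8% lift from the examiner card

with_interview = min(baseline + interview_lift, 1.0)  # cap at 100%
print(f"Grant probability: {baseline:.0%}")           # -> 87%
print(f"With interview:    {with_interview:.0%}")     # -> 97%
```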

Free tier: 3 strategy analyses per month