Office Action Analysis: 18785023 — SYSTEM FOR RENDERING TWO-DIMENSIONAL CONTENT IN A THREE-DIMENSIONAL VIRTUAL REALITY ENVIRONMENT, AND A METHOD THEREOF

Office Action

§103
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Foreign Priority
Acknowledgment is made of applicant’s claim for foreign priority under 35 U.S.C. 119 (a)-(d). The certified copy has been filed in parent Application No GB2312618.8, filed on 08/18/2023.

Claim Rejection – 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claim(s) 1, 3, 5, 11, 13, and 14 is/are rejected under 35 U.S.C. 103 as being unpatentable over Rimon (US 2016/0093105 A1) hereinafter referenced as Rimon in view of Salmani (US 11335077 B1) hereinafter referenced as Salmani.
Regarding claim 1, 
Rimon teaches the following:
A system for rendering two-dimensional, 2D, content in a three-dimensional, 3D, virtual reality environment
"Figure 10 illustrates a system for rendering text information on a head-mounted display to a user, in accordance with an embodiment of the invention." (Rimon, [41])
"rendering a view of a virtual environment to the head-mounted display…receiving text information for rendering on the head-mounted display; presenting the text information in the virtual environment in a vicinity of the gaze target." (Rimon, [11])
“Rendering logic typically includes processing stages configured for determining the three-dimensional spatial relationships between objects and/or for applying appropriate textures, etc., based on the game state and viewpoint.” (Rimon, [148])
Rimon teaches a system for rendering text information in a three-dimensional virtual environment.
comprising: receiving circuitry configured to receive the 2D content, the 2D content being in a 2D format; 
"receiving text information for rendering on the head-mounted display;" (Rimon, [11])
"rendering of text from any source of text information. Merely for purposes of illustration, some representative examples of sources of text information and/or types of text information." (Rimon, [67])
"mobile device 1008 can be configured to receive various types of text information … The mobile device 1008 …described with reference to FIG. 10." (Rimon, [118])
"FIG. 10 illustrates a system for rendering text information on a head-mounted display to a user, in accordance with an embodiment of the invention." (Rimon, [43])
"FIG. 1 illustrates a system for interactive gameplay of a video game, in accordance with an embodiment of the invention. A user 100 is shown wearing a head-mounted display (HMD) 102."  (Rimon, [50])
Rimon teaches the mobile device configured to receive the text information, the text information being from any source of text information. The mobile device is a part of FIG 10 which is the system for the HMD.
environment generating circuitry configured to generate the 3D virtual reality environment, wherein the 3D virtual reality environment comprises a virtual surface upon which the 2D content is to be rendered; 
"the computing device 1000 for rendering to the HMD device 1004" (Rimon, [115])
"rendering a view of a virtual environment to the head-mounted display" (Rimon, [11])
"text information can be displayed on any objects or surfaces that exist within a virtual environment." (Rimon, [68])
“Rendering logic typically includes processing stages configured for determining the three-dimensional spatial relationships between objects and/or for applying appropriate textures, etc., based on the game state and viewpoint.” (Rimon, [148])
Rimon teaches the computing device configured to render the virtual environment to the head-mounted display, wherein the virtual reality environment comprises objects or surfaces that exist within a virtual environment upon which the text information is to be rendered.
and rendering circuitry configured to render the 3D virtual reality environment for display at a head mounted display, HMD, wherein the rendering circuitry is configured to render the 2D content on the virtual surface
"the computing device 1000 for rendering to the HMD device 1004" (Rimon, [115]) 
"rendering a view of a virtual environment to the head-mounted display" (Rimon, [11])
"text information can be displayed on any objects or surfaces that exist within a virtual environment." (Rimon, [68])
Rimon teaches a computer device configured to render the virtual environment to the head-mounted display, HMD, wherein the rendering circuitry is configured to display text information on any objects or surfaces that exist within a virtual environment.
upscale the at least one region within the 3D virtual reality environment indicated in the mask.
"if a user's eyes are determined to be looking in a specific direction, then the video rendering for that direction can be prioritized or emphasized, such as by providing greater detail or faster updates in the region where the user is looking. It should be appreciated that the gaze direction of the user can be defined … relative to a virtual environment that is being rendered on the head-mounted display." (Rimon, [64])
“Rendering logic typically includes processing stages configured for determining the three-dimensional spatial relationships between objects and/or for applying appropriate textures, etc., based on the game state and viewpoint.” (Rimon, [148])
Rimon teaches providing greater detail or faster updates in the region in a virtual environment.
	However, Rimon fails to teach the following: recognition circuitry configured to recognise one or more regions of interest in the 2D content; mask generating circuitry configured to generate, in dependence upon a location of the virtual surface within the generated 3D virtual reality environment and in dependence upon at least one recognised region of interest in the 2D content, a mask of the generated 3D virtual reality environment, wherein the mask indicates at least one region within the 3D virtual reality environment in which the at least one recognised region of interest is to be rendered;
	Salmani does. Salmani teaches the following:
recognition circuitry configured to recognise one or more regions of interest in the 2D content; 
"In particular embodiments the computing system may utilize a machine learning model, comprising one or more neural networks, to detect objects of interest [32]…The machine-learning model is configured to extract features of the image, for example object of interest [33]" (Salmani, [32-33])
Salmani teaches a computer system configured to detect one or more objects of interest from images.
mask generating circuitry configured to generate, in dependence upon a location of the virtual surface within the generated 3D virtual reality environment and in dependence upon at least one recognised region of interest in the 2D content, a mask of the generated 3D virtual reality environment, wherein the mask indicates at least one region within the 3D virtual reality environment in which the at least one recognised region of interest is to be rendered, 
"the embodiments disclosed describe generating two-dimensional bounding boxes, segmentation masks, and 2.5D surfaces to represent detected object of interests using the image data… the computing system to detect and represent one or more object of interests 210 [41] …The 3D mesh is projected onto a 2D image plane of a camera positioned at the viewpoint of the user, resulting in the triangles of the 3D mesh being projected to a flat 2D plane. The projected triangles of the 3D mesh could then be compared to the 2D segmentation mask …to determine where the object of interest 200 is located. In particular embodiments this is done by determining which triangles belong to the corresponding detected object of interest and which belong to other real objects or components of the real environment. In this way, detected object of interests 210 can be posed as a 3D mesh in the 3D environment [42]" (Salmani, [41, 42])
Salmani teaches mask generating circuitry configured to generate, in dependence upon a projection of a 3D mesh onto a 2D image plane within a generated 3D virtual reality environment and in dependence upon at least one detected object of interest represented by a 2D segmentation mask, a mask of the generated 3D virtual reality environment, wherein the mask indicates at least one region within the 3D virtual reality environment in which the detected object of interest is to be posed as a 3D mesh, based on determining which triangles of the 3D mesh correspond to the 2D segmentation mask and which belong to other components of the environment. The computer system generates an artificial reality environment. Within the environment, a 3D mesh is projected onto a 2D image plane. The projected triangles are then compared to the 2D segmentation masks of the objects of interest to locate the objects of interest. Triangles that match a 2D segmentation mask are determined to belong to the object of interest and are posed in the environment. Triangles that do not match the 2D segmentation masks are not regions of interest and are therefore treated as part of the environment.
Rimon is analogous art with respect to Salmani because they are from the same field of endeavor, namely head mounted display image rendering. Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to modify Salmani to create a system for rendering text information in a virtual environment to the head-mounted display comprising: a mobile device configured to receive the text information, the text information being from any source of text information; a computing device configured to render the virtual environment to the head-mounted display, wherein the virtual reality environment comprises objects or surfaces that exist within a virtual environment upon which the text information is to be rendered; a computer system configured to detect one or more objects of interest from images; mask generating circuitry configured to generate, in dependence upon a projection of a 3D mesh onto a 2D image plane within a generated 3D virtual reality environment and in dependence upon at least one detected object of interest represented by a 2D segmentation mask, a mask of the generated 3D virtual reality environment, wherein the mask indicates at least one region within the 3D virtual reality environment in which the detected object of interest is to be posed as a 3D mesh, based on determining which triangles of the 3D mesh correspond to the 2D segmentation mask and which belong to other components of the environment; the computer device configured to render the virtual environment to the head-mounted display, HMD, wherein the rendering circuitry is configured to display text information on any objects or surfaces that exist within a virtual environment, together with triangles corresponding to the detected object of interest and to other real objects or components of the real environment, and to provide greater detail or faster updates in the region in a virtual environment.
A person of ordinary skill in the art would do so in order to improve the immersive and visual experience for the user.
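The triangle classification described in the Salmani passage cited for claim 1 (project the 3D mesh to a 2D image plane, then compare against the 2D segmentation mask) can be illustrated with a minimal sketch. This is not Salmani's implementation: the pinhole camera model, the centroid test, and the names `project_point` and `classify_triangles` are assumptions made only for illustration.

```python
import math
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]
Triangle = Tuple[Vec3, Vec3, Vec3]


def project_point(p: Vec3, focal: float, cx: float, cy: float) -> Tuple[float, float]:
    """Pinhole projection of a camera-space point (z > 0) to pixel coordinates."""
    x, y, z = p
    return (focal * x / z + cx, focal * y / z + cy)


def classify_triangles(
    triangles: Sequence[Triangle],
    mask: Sequence[Sequence[int]],
    focal: float,
    cx: float,
    cy: float,
) -> List[bool]:
    """Flag each triangle whose projected centroid lands on a mask pixel equal to 1.

    Assumption: a centroid test stands in for whatever comparison the
    reference actually uses (it mentions a machine learning model).
    """
    height, width = len(mask), len(mask[0])
    flags = []
    for tri in triangles:
        pts = [project_point(v, focal, cx, cy) for v in tri]
        col = math.floor(sum(p[0] for p in pts) / 3.0)
        row = math.floor(sum(p[1] for p in pts) / 3.0)
        inside = 0 <= row < height and 0 <= col < width
        flags.append(inside and mask[row][col] == 1)
    return flags
```

Triangles flagged True would correspond to the detected object of interest; the rest would be treated as other components of the environment.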

Regarding claim 3, 
Rimon in view of Salmani teaches the system of claim 1, and additionally teaches the following: 
Salmani teaches:
wherein the recognition circuitry is configured to recognise, in the 2D content as received by the receiving circuitry, one or more of the regions of interest in the 2D content.
"In particular embodiments the computing system may utilize a machine learning model, comprising one or more neural networks, to detect objects of interest [32] …The machine-learning model is configured to extract features of the image, for example object of interest 210 such as a human.[33]" (Salmani, [32-33]) 
Salmani teaches the computing system is configured to detect one or more objects of interest in the text information. It is inherent that, in order to recognise content, the content must first be received.
	Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to modify Rimon in view of Salmani with the feature of Salmani to create a computing system configured to detect, in the various types of text information received by the mobile device, one or more objects of interest in the text information.
Regarding claim 5, 
Rimon in view of Salmani teaches the system of claim 1, and additionally teaches the following: 
Salmani teaches:
the mask generating circuitry is configured to: generate a precursory mask in dependence upon at the least one recognised region of interest in the 2D content, wherein the precursory mask indicates at least one region within the 2D content in which the at least one recognised region of interest is located; 
"the embodiments disclosed describe generating two-dimensional bounding boxes, segmentation masks, and 2.5D surfaces to represent detected object of interests using the image data… the computing system to detect and represent one or more object of interests” (Salmani, [41])
“The projected triangles of the 3D mesh could then be compared to the 2D segmentation mask, using for example, a machine learning model, to determine where the object of interest 200 is located.” (Salmani, [42])
Salmani teaches the mask generating circuitry is configured to generate a 2D segmentation mask in dependence upon detected objects of interest, wherein the segmentation mask indicates at least one region within the 2D content by being used to determine where the object of interest is located.
map the precursory mask onto the virtual surface within the generated 3D virtual environment; 
“The 3D mesh is projected onto a 2D image plane of a camera positioned at the viewpoint of the user, resulting in the triangles of the 3D mesh being projected to a flat 2D plane. The projected triangles of the 3D mesh could then be compared to the 2D segmentation mask” (Salmani, [42])
Salmani teaches a 3D mesh is projected onto a 2D image plane and the projected triangles are compared to the 2D segmentation mask, thereby mapping the segmentation mask onto the projected 2D surface within the generated 3D virtual environment.
and generate the mask of the generated 3D virtual reality environment in dependence upon the location of the virtual surface within the generated 3D virtual reality environment and in dependence upon the mapped precursory mask.
"The projected triangles of the 3D mesh could then be compared to the 2D segmentation mask…to determine where the object of interest 200 is located. In particular embodiments this is done by determining which triangles belong to the corresponding detected object of interest and which belong to other real objects or components of the real environment. In this way, detected object of interests 210 can be posed as a 3D mesh in the 3D environment [42]" (Salmani, [42])
Salmani teaches determining which projected mesh triangles correspond to the segmentation mask (mapped to the precursory mask) and which do not, thereby generating a mask of the 3D environment dependent upon the location of the virtual surface and the mapped segmentation mask. The classification of the triangles is based on their correspondence to the mask, and the result is posed as a 3D mesh in the 3D environment, which reads on the masked regions in the 3D space. A 3D mesh is projected onto a 2D image plane and the projected triangles are compared to the 2D segmentation mask, thereby mapping the segmentation mask onto the projected 2D surface within the generated 3D virtual environment.
	Rimon teaches the following:
2D content.
"rendering of text from any source of text information. Merely for purposes of illustration, some representative examples of sources of text information and/or types of text information." (Rimon, [67])
Rimon teaches 2D content.
Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to modify Rimon in view of Salmani with the above mentioned features of Rimon and Salmani to create mask generating circuitry configured to generate a 2D segmentation mask in dependence upon detected 2D content of interest, wherein the segmentation mask indicates at least one region within the 2D content by being used to determine where the object of interest is located; and to determine which projected mesh triangles correspond to the segmentation mask (mapped to the precursory mask) and which do not, thereby generating a mask of the 3D environment dependent upon the location of the virtual surface and the mapped segmentation mask. The classification of the triangles is based on their correspondence to the mask, and the result is posed as a 3D mesh in the 3D environment, which reads on the masked regions in the 3D space. A 3D mesh is projected onto a 2D image plane and the projected triangles are compared to the 2D segmentation mask, thereby mapping the segmentation mask onto the projected 2D surface within the generated 3D virtual environment.
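The claim 5 sequence (precursory mask in 2D content space, mapped onto the virtual surface, yielding a mask of the rendered environment) can be sketched as follows. This is a deliberately simplified illustration, not the applicant's or Salmani's method: it assumes the virtual surface appears in the rendered view as an axis-aligned rectangle and uses nearest-neighbour resampling. The function name `map_mask_to_view` and the `rect` parameter are hypothetical.

```python
from typing import List, Sequence, Tuple


def map_mask_to_view(
    content_mask: Sequence[Sequence[int]],
    view_w: int,
    view_h: int,
    rect: Tuple[int, int, int, int],
) -> List[List[int]]:
    """Resample a 2D content mask into a view-sized mask.

    rect = (left, top, width, height) is the assumed on-screen footprint of
    the virtual surface, in view pixels. Pixels outside the footprint stay 0.
    """
    left, top, rect_w, rect_h = rect
    src_h, src_w = len(content_mask), len(content_mask[0])
    view_mask = [[0] * view_w for _ in range(view_h)]
    # Clip the footprint to the view so off-screen parts are ignored.
    for row in range(max(top, 0), min(top + rect_h, view_h)):
        for col in range(max(left, 0), min(left + rect_w, view_w)):
            # Nearest-neighbour lookup back into the content mask.
            src_row = (row - top) * src_h // rect_h
            src_col = (col - left) * src_w // rect_w
            view_mask[row][col] = content_mask[src_row][src_col]
    return view_mask
```

A surface viewed at an angle would need a perspective (homography) mapping instead of the rectangle assumed here.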
Claim 11 is rejected using the same rationale or bases as applied to claim 1.
Claim 13 is rejected using the same rationale or bases as applied to claim 3.
Claim 14 is rejected using the same rationale or bases as applied to claim 1 and the mentioned structure.
Additionally, claim 14 recites the following structure:
A non-transitory, computer readable storage medium containing a computer program comprising computer executable instructions that when executed by a computer system
"With reference to FIG. 12, a diagram illustrating components of a head-mounted display 102 is shown, in accordance with an embodiment of the invention. The head-mounted display 102 includes a processor 1300 for executing program instructions. A memory 1302 is provided for storage purposes, and may include both volatile and non-volatile memory [126] … Processor 1450 is configured to execute logic, e.g. software, included within the various components of Video Server System 1420 discussed herein [153] … Storage 1455 includes non-transitory analog and/or digital storage devices [154] … The apparatus can be specially constructed for the required purpose, or the apparatus can be a general-purpose computer selectively activated or configured by a computer program stored in the computer. In particular, various general-purpose machines can be used with computer programs written in accordance with the teachings herein, or it may be more convenient to construct a more specialized apparatus to perform the required operations [160]" (Rimon, [126, 153, 154, 160])
Rimon teaches non-transitory analog and/or digital storage devices containing computer programs comprising computer executable instructions that are executed by a processor.


Claim(s) 2, 4, 12 is/are rejected under 35 U.S.C. 103 as being unpatentable over Rimon in view of Salmani, and Lukas (WO 2023081573 A1) hereinafter referenced as Lukas.
Regarding claim 2, 
Rimon in view of Salmani teach the system of claim 1, and additionally teach the following: 
Rimon teaches:
the receiving circuitry is configured to receive viewpoint data indicating a location and/or orientation of an HMD within a real-world environment
"when considered in combination with the tracked location and orientation of the HMD 102, a real-world gaze direction of the user can be determined, as the location and orientation of the HMD 102 is synonymous with the location and orientation of the user's head. That is, the real-world gaze direction of the user can be determined from tracking the positional movements of the user's eyes and tracking the location and orientation of the HMD 102." (Rimon, [64])
"Figure 10 illustrates a system for rendering text information on a head-mounted display to a user, in accordance with an embodiment of the invention." (Rimon, [43])
Rimon teaches the HMD is configured to receive real-world gaze direction indicating the location and orientation of the HMD 102.
the recognition circuitry is configured to: render, in dependence upon the received viewpoint data
"It should be appreciated that the gaze direction of the user can be defined … relative to a virtual environment that is being rendered on the head-mounted display [63] … When a view of a virtual environment is rendered on the HMD 102, the real-world gaze direction of the user can be applied to determine a virtual world gaze direction of the user in the virtual environment. [64]" (Rimon, [63, 64])
Rimon teaches the HMD is configured to: render, in dependence upon the received real-world gaze direction. 
the 3D candidate environment comprises at least: the virtual surface of the generated 3D virtual reality environment, and the 2D content, wherein the 2D content is rendered on the virtual surface
"text information can be displayed on any objects or surfaces that exist within a virtual environment. The display of text information in this manner can provide for rendering of text in a manner that is integrated into the context of the virtual environment." (Rimon, [68])
Rimon teaches the virtual environment comprises at least: objects or surfaces that exist within a virtual environment, and text information, wherein the text information is displayed on the objects or surfaces and rendered in a manner integrated into the context of the virtual environment. 
	Salmani teaches:
recognise one or more of the regions of interest comprised within the 2D content rendered on the virtual surface
"At a high-level, a computing system associated with an artificial reality system may receive image data of an environment, and detect one or more objects of interest in the artificial reality environment." (Salmani, [04])
Salmani teaches how to detect an object of interest.
Rimon in view of Salmani fail to teach the following: a stereoscopic view of a 3D candidate environment for display at the HMD; a left and right eye view for a left and right eye display screen of the HMD.
	But, Lukas does. Lukas teaches the following:
a stereoscopic view of a 3D candidate environment for display at the HMD,
"The environment may be a real-world environment, a virtual environment [56] … the head-mounted apparatus 252 include stereoscopic display(s) that display content to each of the eyes of the head-mounted apparatus 252 [62] … The output images may provide a stereoscopic view of the environment, in some cases with the virtual content overlaid and/or with other modifications." (Lukas, [56, 62, 69])
Lukas teaches a stereoscopic view of the environment for display at the HMD.
wherein the stereoscopic view comprises a left eye view for a left eye display screen of the HMD and a right eye view for a right eye display screen of the HMD
"The output images may provide a stereoscopic view of the environment…the HMD 310 can display a first display image to the user 320’s right eye, the first display image based on an image captured by the first camera 330A. The HMD 310 can display a second display image to the user 320’s left eye, the second display image based on an image captured by the second camera 330B." (Lukas, [69])
Lukas teaches the stereoscopic view comprises a second display image to the user’s left eye of the HMD and a first display image to the user’s right eye of the HMD.
Lukas is analogous art with respect to Rimon in view of Salmani because they are from the same field of endeavor, namely head mounted display image rendering. Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to modify Rimon in view of Salmani to configure the HMD to: render, in dependence upon the received real-world gaze direction, a stereoscopic view of the environment for display at the HMD, wherein the stereoscopic view comprises a second display image to the user’s left eye of the HMD and a first display image to the user’s right eye of the HMD, wherein the environment comprises at least: the surfaces that exist within a virtual environment, and the text information, wherein the text information is rendered on the virtual surface for at least one of the two displays (the left eye display and the right eye display); and detect one or more of the objects of interest, in a vicinity of the gaze target, in the text information displayed on any objects or surfaces that exist within a virtual environment. A person of ordinary skill in the art would do so in order to improve the immersive and visual experience for the user.
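For orientation, the left eye view and right eye view recited in claim 2 are typically produced by rendering the same scene from two slightly offset camera positions. The sketch below shows only that offset computation. It is an illustrative assumption, not something taken from Lukas (whose quoted passage describes display images based on two physical cameras); the function name `eye_positions` and the interpupillary distance parameter `ipd` are hypothetical.

```python
from typing import Tuple

Vec3 = Tuple[float, float, float]


def eye_positions(head: Vec3, right_axis: Vec3, ipd: float) -> Tuple[Vec3, Vec3]:
    """Return (left_eye, right_eye) camera positions.

    head       -- HMD position in the environment
    right_axis -- unit vector pointing to the wearer's right (assumed normalised)
    ipd        -- assumed interpupillary distance, same units as head
    """
    half = ipd / 2.0
    left = tuple(h - half * r for h, r in zip(head, right_axis))
    right = tuple(h + half * r for h, r in zip(head, right_axis))
    return left, right
```

Each returned position would then serve as the viewpoint for one eye's render pass.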

Regarding claim 4, 
Rimon in view of Salmani and Lukas teaches the system of claim 1, and additionally teaches the following: 
Salmani teaches:
the recognition circuitry is configured to recognise a given region of interest 
"In particular embodiments the computing system may utilize a machine learning model, comprising one or more neural networks, to detect objects of interest [32] …The machine-learning model is configured to extract features of the image, for example object of interest 210 such as a human.[33]" (Salmani, [32-33]) 
Salmani teaches the computer system is configured to detect a given object of interest.
Lukas teaches the following:
given region of interest by recognizing one or more selected from the list consisting of: i. a collection of one or more alphanumeric characters; ii. a user interface, UI, element; iii. a face of a character; iv. metadata defining a location within the 2D content at which optional text is to be displayed; and v. a predetermined object.
“The element can include, for example, one or more images, one or more videos, one or more strings of characters (e.g., alphanumeric characters, numbers, text, Unicode characters, symbols, and/or icons)” (Lukas, [56])
Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to modify Rimon in view of Salmani to configure the computer system to detect a given object of interest by detecting one or more selected from the list consisting of: one or more strings of characters (e.g., alphanumeric characters, numbers, text, Unicode characters, symbols, and/or icons).

Claim 12 is rejected using the same rationale or bases as applied to claim 2.


Claim(s) 6 and 7 is/are rejected under 35 U.S.C. 103 as being unpatentable over Rimon in view of Salmani, and MATLAB (Specify ROI as Binary Mask) hereinafter referenced as MATLAB.
Regarding claim 6, 
Rimon in view of Salmani teaches the system of claim 1, and additionally teaches the following: 
Salmani teaches:
the mask generating circuitry is configured to assign and render
"the embodiments disclosed describe generating two-dimensional bounding boxes, segmentation masks, and 2.5D surfaces to represent detected object of interests using the image data…the computing system to detect and represent one or more object of interests 210" (Salmani, [41])
“the computing system can determine and assign a depth to the surface representing the object of interest” (Salmani, [39])
Salmani teaches the computer system (mask generating circuitry) is configured to assign a value to an object of interest and can render segmentation masks to represent detected object of interests.
However, Rimon in view of Salmani fail to teach the following: the mask comprises a plurality of pixels; a value of zero to a given pixel if the given pixel is not comprised within the at least one recognised region of interest, or assign one or more non-zero values to the given pixel if the given pixel is comprised within the at least one recognised region of interest, thereby indicating the at least one region within the 3D virtual reality environment in which the at least one recognised region of interest.
But MATLAB does. MATLAB teaches the following:
the mask comprises a plurality of pixels; a value of zero to a given pixel if the given pixel is not comprised within the at least one recognised region of interest, or assign one or more non-zero values to the given pixel if the given pixel is comprised within the at least one recognised region of interest, thereby indicating the at least one region within the 3D virtual reality environment in which the at least one recognised region of interest
"A binary mask defines a region of interest (ROI) of an image. Mask pixel values of 1 indicate image pixels that belong to the ROI. Mask pixel values of 0 indicate image pixels that are part of the background. Depending on the application, an ROI can consist of contiguous or discontiguous groups of pixels." (MATLAB Specify ROI as Binary Mask)
MATLAB teaches a mask comprising a plurality of pixel values. A value of zero is assigned to a given pixel if the pixel is part of the background, and a value of one is assigned to the given pixel if the pixel belongs to the ROI, thereby indicating the at least one region within the 3D virtual reality environment in which the at least one recognised region of interest is to be rendered.
MATLAB is analogous art with respect to Rimon in view of Salmani because they are from the same field of endeavor, namely image processing. Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to modify Rimon in view of Salmani to create a 3D mask comprising a plurality of pixel values, wherein the computer system is configured to assign a value of zero to a given pixel if the pixel is part of the background, or to assign one or more non-zero values to the given pixel if it belongs to the ROI; a mask pixel value of 1 indicates that the pixel belongs to the ROI and is comprised within the at least one detected object of interest, thereby indicating the at least one region within the 3D virtual reality environment in which the at least one detected object of interest is to be rendered. A person of ordinary skill in the art would do so in order to improve the immersive and visual experience for the user.
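The binary mask convention quoted from the MATLAB reference (1 for pixels in the ROI, 0 for background, possibly discontiguous regions) can be shown with a short sketch. It is written in plain Python rather than MATLAB, and the rectangular ROI representation and the name `binary_roi_mask` are assumptions for illustration only.

```python
from typing import List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom), right/bottom exclusive


def binary_roi_mask(width: int, height: int, rois: Sequence[Box]) -> List[List[int]]:
    """Build a mask with 1 inside any ROI box and 0 everywhere else."""
    mask = [[0] * width for _ in range(height)]
    for left, top, right, bottom in rois:
        # Clip each box to the mask bounds before filling.
        for row in range(max(top, 0), min(bottom, height)):
            for col in range(max(left, 0), min(right, width)):
                mask[row][col] = 1
    return mask
```

Passing several boxes yields a discontiguous ROI, matching the note in the quoted passage that an ROI can consist of contiguous or discontiguous groups of pixels.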

Regarding claim 7, 
Rimon in view of Salmani and MATLAB teaches the system of claim 6, and additionally teaches the following: 
Rimon teaches:
the receiving circuitry is configured to receive viewpoint data indicating a location and/or orientation of an HMD within a real-world environment;
"when considered in combination with the tracked location and orientation of the HMD 102, a real-world gaze direction of the user can be determined, as the location and orientation of the HMD 102 is synonymous with the location and orientation of the user's head. That is, the real-world gaze direction of the user can be determined from tracking the positional movements of the user's eyes and tracking the location and orientation of the HMD 102." (Rimon, [64])
Rimon teaches the HMD is configured to receive real-world gaze direction indicating the location and orientation of the HMD.
Salmani teaches the following:
if the given pixel is comprised within the at least one recognised region of interest, the mask generating circuitry is configured to assign, in dependence upon the received viewpoint data, a HMD proximity value indicating a virtual distance between a location of the HMD within the generated 3D virtual reality environment and a part of the virtual surface that is covered by the given pixel
"the computing system may utilize a ray-casting or other rendering process, such as ray tracing, for determining visual information and location information of one or more virtual objects that are to be displayed within a view of an artificial reality environment...used to determine a visibility of a virtual object 410 relative to the object of interest 210 by comparing a model of the virtual object with the surface. The ray-casting process may ultimately be used to associate pixels of the screen with points of intersection on any objects that would be visible for a view of an artificial reality environment [43]…the computing system may receive one or more depth measurements of the real environment corresponding to the portion of the image comprising the object of interest [38] …The computing system may further determine and assign a depth value to the surface representing the detected object of interest in the image. Depth may represent the distance from the HMD worn by the user to the detected object…Assigning one depth value conserves computing resources for posing and rendering the surface in the artificial reality environment without depreciable loss in the immersive-ness of the artificial reality environment. [59]" (Salmani, [38, 43, 59])
Salmani teaches that a given pixel is comprised within at least one recognised region of interest: the ray-casting process associates pixels of the screen with points of intersection on the surface of the detected object of interest, such that pixels falling on that surface are comprised within the object-of-interest region. Only those pixels falling on that surface are brought within the scope of the subsequent assignment step, since pixels must be identified before assignment. The computing system (the mask generating circuitry) determines and assigns a depth value to the surface representing the detected object of interest. The depth value is determined using depth measurements of the real environment corresponding to the portion of the image comprising the object of interest, and is therefore dependent upon the viewpoint. The assigned depth value serves as a proximity value: it represents the distance from the HMD worn by the user, i.e., from the HMD location within the artificial reality environment, to the detected object, whose surface is the surface with which the pixel is associated via the ray-casting process.
Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to modify Rimon in view of Salmani and MATLAB to create an HMD configured to receive a real-world gaze direction indicating the location and orientation of the HMD, wherein, with the determined associated pixels of intersection for objects of interest, the computing system is configured to assign, in dependence upon receiving one or more depth measurements of the real environment corresponding to the portion of the image comprising the object of interest, a depth value indicating a distance from the HMD worn by the user to the detected object and a surface representing the detected object of interest in the image.
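As an illustrative sketch only, the per-pixel HMD proximity value recited in claim 7 could be modelled as the Euclidean distance between the HMD location and the surface point covered by each ROI pixel. Everything named here is an assumption: `hmd_proximity_mask` is hypothetical, the per-pixel 3D surface points are taken as already available (for example, from a ray-casting step), and Salmani itself describes assigning a single depth value to the whole surface rather than one per pixel.

```python
import math


def hmd_proximity_mask(roi_mask, surface_points, hmd_position):
    """Assign each ROI pixel its distance from the HMD; background stays 0.0.

    roi_mask:       rows of 0/1 values (1 = pixel is inside the ROI).
    surface_points: rows of (x, y, z) points covered by each pixel (assumed input).
    hmd_position:   (x, y, z) location of the HMD in the same coordinate space.
    """
    return [
        [math.dist(hmd_position, point) if inside else 0.0
         for inside, point in zip(mask_row, point_row)]
        for mask_row, point_row in zip(roi_mask, surface_points)
    ]


# One background pixel and one ROI pixel whose surface point is 5 units away.
proximity = hmd_proximity_mask(
    [[0, 1]],
    [[(0.0, 0.0, 0.0), (3.0, 0.0, 4.0)]],
    (0.0, 0.0, 0.0),
)
```

Because the values depend on `hmd_position`, they change as the received viewpoint data changes, which is the dependence the claim language recites.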
Claim 8 is/are rejected under 35 U.S.C. 103 as being unpatentable over Rimon in view of the following: Salmani, MATLAB, and Agrawal (Image Processing in Python -The Computer Vision Techniques), hereinafter referenced as Agrawal.
Regarding claim 8, 
Rimon in view of Salmani and MATLAB teaches the system of claim 7, and additionally teaches the following: 
Rimon teaches:
the rendering circuitry is configured to upscale the at least one region within the 3D virtual reality environment indicated in the mask in response to a value
"rendering a view of a virtual environment to the head-mounted display [11] … the computing device 1000 for rendering to the HMD device 1004" (Rimon, [11, 115])
"if a user's eyes are determined to be looking in a specific direction, then the video rendering for that direction can be prioritized or emphasized, such as by providing greater detail or faster updates in the region where the user is looking. It should be appreciated that the gaze direction of the user can be defined … relative to a virtual environment that is being rendered on the head-mounted display." (Rimon, [64])
Rimon teaches the computing device is configured to provide greater detail or faster updates in the region in a virtual environment in response to the user’s eyes.
Salmani teaches the following:
HMD proximity value
“The computing system may further determine and assign a depth value to the surface representing the detected object of interest in the image. Depth may represent the distance from the HMD worn by the user to the detected object.” (Salmani, [59])
Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to modify Rimon in view of Salmani and MATLAB to configure the computing device to provide greater detail or faster updates in the region in a virtual environment in response to the distance from the HMD worn by the user to the detected object.
However, Rimon in view of Salmani and MATLAB do not teach assigning a given pixel to become less than or equal to a threshold value.
Agrawal teaches the following:
the value assigned to the given pixel becoming less than or equal to a threshold value. 
“Now, thresholding can be defined as a process in which each pixel is converted to either 0 or 255 depending on whether its value is greater than or less than a threshold value. If the value of the pixel is greater than the threshold value, then it is converted to 255 otherwise it is converted to 0. This is how thresholding works.” (Agrawal)
Agrawal teaches the value assigned to the given pixel becoming less than a threshold value.
Agrawal is analogous art with respect to Rimon in view of Salmani and MATLAB because they are from the same field of endeavor, namely image processing. Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to modify Rimon in view of Salmani and MATLAB to configure the computing device to provide greater detail or faster updates in the region in a virtual environment in response to the distance from the HMD worn by the user, as represented by the value assigned to the given pixel, becoming less than a threshold value. A person of ordinary skill in the art would do so in order to improve the immersive and visual experience for the user.
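As an illustrative sketch only, the two threshold comparisons discussed for claim 8 can be contrasted as follows. The first function assumes the conventional binary form of thresholding (values above the threshold map to 255, all others to 0); the second expresses the claimed trigger, in which upscaling occurs once a pixel's proximity value becomes less than or equal to a threshold. Both function names are hypothetical and neither is taken from Agrawal or the application.

```python
def binary_threshold(image, threshold):
    """Conventional binary thresholding: > threshold -> 255, otherwise 0."""
    return [[255 if px > threshold else 0 for px in row] for row in image]


def should_upscale(proximity_value, threshold):
    """Claimed trigger: upscale when the value is less than or equal to the threshold."""
    return proximity_value <= threshold


binarized = binary_threshold([[10, 200], [128, 127]], 127)
near_enough = should_upscale(2.0, 2.0)   # equal to the threshold
too_far = should_upscale(2.5, 2.0)       # above the threshold
```

The sketch makes the difference in direction visible: thresholding rewrites pixel values according to which side of the threshold they fall on, whereas the claim element conditions a rendering action on a value falling to or below the threshold.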
Claim 9 is/are rejected under 35 U.S.C. 103 as being unpatentable over Rimon in view of the following: Salmani, MATLAB, and Chiu (US 2023100689 A1), hereinafter referenced as Chiu.
Regarding claim 9, 
Rimon in view of Salmani, and MATLAB teaches the system of claim 6, and additionally teaches the following: 
Rimon teaches:
The receiving circuitry is configured to receive gaze data indicating a location within a display screen of the HMD at which a user thereof is gazing.
"In one embodiment, a gaze tracking camera 312 is included in the HMD 102 to enable tracking of the gaze of the user. The gaze tracking camera captures images of the user's eyes, which are analyzed to determine the gaze direction of the user. In one embodiment, information about the gaze direction of the user can be utilized to affect the video rendering… It should be appreciated that the gaze direction of the user can be defined relative to the head-mounted display, relative to a real environment in which the user is situated, and/or relative to a virtual environment that is being rendered on the head-mounted display. " (Rimon, [64])
Rimon teaches the HMD is configured to track the gaze of the user indicating a direction within a virtual environment of the HMD at which a user thereof is gazing.
Salmani teaches the following:
if the given pixel is comprised within the at least one recognised region of interest, the mask generating circuitry is configured to assign, in dependence upon the received data, a value
"the computing system may utilize a ray-casting or other rendering process, such as ray tracing, for determining visual information and location information of one or more virtual objects that are to be displayed within a view of an artificial reality environment...used to determine a visibility of a virtual object 410 relative to the object of interest 210 by comparing a model of the virtual object with the surface. The ray-casting process may ultimately be used to associate pixels of the screen with points of intersection on any objects that would be visible for a view of an artificial reality environment [43]…the computing system may receive one or more depth measurements of the real environment corresponding to the portion of the image comprising the object of interest [38] …The computing system may further determine and assign a depth value to the surface representing the detected object of interest in the image. Depth may represent the distance from the HMD worn by the user to the detected object…Assigning one depth value conserves computing resources for posing and rendering the surface in the artificial reality environment without depreciable loss in the immersive-ness of the artificial reality environment. [59]" (Salmani, [38, 43, 59])
Salmani teaches that the ray-casting process associates pixels of the screen with points of intersection on the surface of the detected object of interest. Only those pixels falling on that surface are brought within the scope of the subsequent assignment step, since pixels must be identified before assignment. The assignment step is scoped to those pixels which fall within the detected object-of-interest surface because assigning a value to pixels outside that surface would not add to the environment. The computing system (the mask generating circuitry) is configured to assign, in dependence upon the received measurement data, a value to the surface of the detected object of interest, which is the surface associated with the pixel by the ray-casting process and which constitutes the part of the virtual surface covered by the given pixel.
Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to modify Rimon in view of Salmani and MATLAB to create an HMD configured to receive a real-world gaze direction indicating the location and orientation of the HMD, and to determine the visual and location information and associated pixels of intersection for objects of interest, wherein the computing system is configured to assign, in dependence upon receiving one or more depth measurements of the real environment corresponding to the portion of the image comprising the object of interest, a depth value indicating a distance from the HMD worn by the user to the detected object and a surface representing the detected object of interest in the image.
However, Rimon in view of Salmani and MATLAB do not teach a gaze proximity value indicating a virtual distance between a location within the generated 3D virtual reality environment at which the user is gazing.
But, Chiu does. Chiu teaches the following:
a gaze proximity value indicating a virtual distance between a location within the generated 3D virtual reality environment at which the user is gazing and a part of the virtual surface that is covered by the given pixel
"In some embodiments, a location of the gaze of the user is used to determine a region in the environment in which the cursor can be positioned … In some embodiments, for the circular or spherical region, a boundary of the region is the threshold distance from the location of the gaze, such that the threshold distance is a radius of the region." (Chiu, [380])
Chiu teaches a threshold distance value defining a region within the 3D virtual reality environment, the region being centered at a location of the user’s gaze and bounded by a radius extending from the gaze location to portions of a virtual surface corresponding to a given pixel.
Chiu is analogous art with respect to Rimon in view of Salmani and MATLAB because they are from the same field of endeavor, namely image processing and, more specifically, virtual and augmented reality. Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to modify Rimon in view of Salmani and MATLAB to configure an HMD to track the gaze of the user indicating a direction within a virtual environment of the HMD at which a user thereof is gazing, wherein, with the determined associated pixels of intersection for objects of interest, the computing system is configured to assign, in dependence upon the received gaze data, a threshold distance value defining a region within the 3D virtual reality environment, the region being centered at a location of the user’s gaze and bounded by a radius extending from the gaze location to portions of a virtual surface corresponding to a given pixel. A person of ordinary skill in the art would do so in order to facilitate better user interaction within the three-dimensional environment.
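As an illustrative sketch only, the distinction between the claimed gaze proximity value and the threshold distance cited from Chiu can be expressed in a few lines. Here the gaze proximity value is assumed to be the Euclidean distance between the gaze location and the surface point covered by a pixel, while the spherical region is a fixed radius around the gaze location. The function names and the choice of Euclidean distance are assumptions, not details from Chiu or the application.

```python
import math


def gaze_proximity(gaze_point, surface_point):
    """Per-pixel value: distance from the gaze location to the covered surface point."""
    return math.dist(gaze_point, surface_point)


def in_gaze_region(gaze_point, surface_point, radius):
    """Spherical-region test: is the surface point within `radius` of the gaze?"""
    return gaze_proximity(gaze_point, surface_point) <= radius


distance = gaze_proximity((0.0, 0.0, 0.0), (1.0, 2.0, 2.0))
inside = in_gaze_region((0.0, 0.0, 0.0), (1.0, 2.0, 2.0), 3.0)
outside = in_gaze_region((0.0, 0.0, 0.0), (1.0, 2.0, 2.0), 2.9)
```

In this framing the radius is a single constant that bounds a region, whereas the proximity value varies from pixel to pixel; whether the former teaches the latter is the question the rejection turns on.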

Claim 10 is/are rejected under 35 U.S.C. 103 as being unpatentable over Rimon in view of the following: Salmani, MATLAB, Chiu, and Agrawal.

Regarding claim 10, 
Rimon in view of the following: Salmani, MATLAB, and Chiu teaches the system of claim 9, and additionally teaches the following: 
Rimon teaches:
the rendering circuitry is configured to upscale the at least one region within the 3D virtual reality environment indicated in the mask in response to the gaze proximity value
"rendering a view of a virtual environment to the head-mounted display [11] … the computing device 1000 for rendering to the HMD device 1004" (Rimon, [11, 115])
"if a user's eyes are determined to be looking in a specific direction, then the video rendering for that direction can be prioritized or emphasized, such as by providing greater detail or faster updates in the region where the user is looking. It should be appreciated that the gaze direction of the user can be defined … relative to a virtual environment that is being rendered on the head-mounted display." (Rimon, [64])
Rimon teaches the computing device is configured to provide greater detail or faster updates in the region in a virtual environment in response to the user’s eyes.
Chiu teaches the following:
a gaze proximity value 
"In some embodiments, a location of the gaze of the user is used to determine a region in the environment in which the cursor can be positioned … In some embodiments, for the circular or spherical region, a boundary of the region is the threshold distance from the location of the gaze, such that the threshold distance is a radius of the region." (Chiu, [380])
Chiu teaches a threshold distance value defining a region within the 3D virtual reality environment, the region being centered at a location of the user’s gaze and bounded by a radius extending from the gaze location to portions of a virtual surface corresponding to a given pixel.
Agrawal teaches the following:
the value assigned to the given pixel becoming less than or equal to a threshold value. 
“Now, thresholding can be defined as a process in which each pixel is converted to either 0 or 255 depending on whether its value is greater than or less than a threshold value. If the value of the pixel is greater than the threshold value, then it is converted to 255 otherwise it is converted to 0. This is how thresholding works.” (Agrawal)
Agrawal teaches the value assigned to the given pixel becoming less than a threshold value.
Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to modify Rimon in view of the following: Salmani, MATLAB, Chiu, and Agrawal to configure the computing device to provide greater detail or faster updates in the region in a virtual environment in response to a threshold distance value defining a region within the 3D virtual reality environment, the region being centered at a location of the user’s gaze and bounded by a radius extending from the gaze location to portions of a virtual surface corresponding to a given pixel, when the value assigned to the given pixel becomes less than a threshold value.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to DUNE NGUYEN whose telephone number is (571)272-8919. The examiner can normally be reached M-TH 7:00AM - 5:00PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Devona E Faulk can be reached at (571) 272-7515. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/DUNE NGOC NGUYEN/Examiner, Art Unit 2618         


/DEVONA E FAULK/Supervisory Patent Examiner, Art Unit 2618