DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claims 2-23 are pending in the instant application. Claims 2 and 9-16 are amended. Claim 1 is canceled and claims 22-23 are added.
Response to Arguments
Applicant's arguments with respect to amended claims 2, 9 and 16 have been considered but are moot in view of the new ground(s) of rejection.
Claim Objections
Claim 22 is objected to because of the following informalities:
Claim 22, line 2, recites “an enhanced image quality”. The examiner suggests “the enhanced image quality”, since the phrase previously appears in claim 2.
Appropriate correction is required.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 2-21 are rejected under 35 U.S.C. 103 as being unpatentable over Miyaki (US 20190250773 A1) in view of Duanmu et al. (US 20220377304 A1, hereinafter referenced as Duanmu).
Regarding Claim 2, Miyaki teaches a computer-implemented method (see Fig. 7A, abstract, para. [0006], para. [0078]. Method operations used for pre-loading a virtual scene of an interactive application) comprising:
determining a gesture of a user that is interacting with a scene of a computer game (see Fig. 3A, Fig. 4, para. [0025]-[0026], para. [0049], para. [0052], para. [0058]. A user wearing a head mounted display (HMD) accesses a local server or a remote server on a cloud system, for example, through a user account and selects an interactive application, such as a video game. In response to the request, the server identifies a virtual scene of the interactive application and provides content of the virtual scene for rendering on a display screen of the HMD. The predictive interactions provided by the user in the virtual scene include movement of an avatar or user representation of a user within the virtual scene, direction of motion, extension of a hand of the user into the virtual scene that can be determined by tracking a wearable device, a trigger of a controller, etc. The user may or may not be wearing a wearable device that can be tracked. The interaction predictor 119 may track the user's hand using sensors built in the HMD or using sensors that are external to the HMD or by tracking a wearable device worn on the hand of the user);
identifying a particular item within the scene of the computer game based on the gesture of the user (see Fig. 3A, Fig. 4, para. [0025]-[0026], para. [0033], para. [0047]-[0062]. A user wearing a head mounted display (HMD) accesses a local server or a remote server on a cloud system, for example, through a user account and selects an interactive application, such as a video game. In response to the request, the server identifies a virtual scene of the interactive application and provides content of the virtual scene for rendering on a display screen of the HMD. For example, FIG. 3A illustrates a virtual scene, scene A 303, of the interactive application. The predictive interactions provided by the user in the virtual scene may include gaze direction, movement of an avatar or user representation of a user within the virtual scene, direction of motion, extension of a hand of the user into the virtual scene that can be determined by tracking a wearable device, a trigger of a controller, etc., or any other cues (i.e., signals) that are indicative of the user's interest toward the different visual options, wherein these interactions are actions provided by the user without actual selection of the visual options. The visual options may be in the form of buttons, interactive images of virtual objects, such as a virtual image of a door, a virtual image of an elevator, a virtual image of a set of stairs, etc., or an interactive image of a scene, or a floating menu of options for accessing the different virtual scenes. As the user interacts with the content of the virtual scene, the user's action of moving his head in a particular direction may be used to identify a gaze indicator. Similarly, the user action of moving toward a particular object (e.g., visual option) within the virtual scene may be used to identify a movement indicator, etc. Each of these indicators identified based on the user's interaction in the virtual scene may be accorded different weights. The closer the user gets to the visual option, the greater the weight accorded to the movement indicator. FIG. 4 illustrates one example, wherein the interaction predictor module 119 monitors the user's interactivity within the virtual scene of FIG. 3A over time and evaluates the interactive indicators identified from the user's interactivity to predict the user's imminent selection of a visual option for pre-loading a corresponding virtual scene for user interaction. As shown in FIG. 4, the interaction predictor module 119 detects that at time t0 the user's gaze direction is directed toward visual option 305 (image of the elevator) in the virtual scene A 303, for example. Further, the interaction predictor module 119 may detect the user moving toward the visual option 305. This information is provided to the threshold computator engine 121, which generates a cumulative representation of the interactive indicators. As a result, the cumulative interactive indicators generated by the threshold computator engine 121 for time t0 are represented as (G1 w1+M1 w1), wherein G1 represents gaze interactivity in direction 1 (i.e., toward visual option 305), w1 is the weight accorded to the gaze interactivity at time t0, and M1 is the movement interactivity in direction 1. At time t1, the interaction predictor module 119 detects that the user's interactivity, which includes gaze activity and movement activity, continues to be directed toward visual option 305.
Further, at time t1, the user has extended his hand in direction 1, which points toward the visual option 305. As a result, at time t1, the cumulative interactive indicators generated by the threshold computator engine 121 are represented as (G1 w2+M1 w2+HE1 w1), wherein HE1 represents the hand extension activity in direction 1 and w2 is the weight accorded to the various activities at time t1. The interaction predictor module 119 continues to evaluate the interactive indicators during the user's interaction with the virtual scene. Based on the evaluation, at time t2, the cumulative interactive indicators generated by the threshold computator engine 121 may be represented as (G1 w3+M1 w3+HE1 w2), wherein w3 is the weight accorded to the gaze and movement indicators as each of these indicators remains in direction 1 at time t2. Based on the evaluation, it may be determined that the cumulative action weights of the various interactive indicators have reached the pre-defined threshold value at time t2, indicating that the user's imminent selection is visual option 305. In response, the threshold computator engine 121 sends a signal to the interaction predictor 119 to indicate that visual option 305 was a target of imminent selection by the user, as shown in FIG. 4. Accordingly, a second virtual scene associated with the visual option 305 is selected, loaded, cached and kept ready to enable the pre-loader module to execute the code of the second virtual scene when the visual option 305 is selected by the user, to enable full rendering of the second virtual scene for user interaction);
pre-fetching graphics data associated with the particular item within the scene that was identified based on the gesture of the user (see Fig. 3A, Fig. 4, para. [0007]-[0008], para. [0030]-[0031], para. [0049]-[0050], para. [0054]-[0057], para. [0062], para. [0065]. Based on the prediction of a visual option selection, content of a different virtual scene associated with the visual option is pre-loaded and cached into memory in advance of the user selection of the visual option to access the different virtual scene so that the content can be readily rendered and made available for user interaction); and
after the graphics data is pre-fetched, providing, for output to the user (see Figs. 3A-4, para. [0041], para. [0046], para. [0049]-[0050], para. [0054]-[0057], para. [0062], para. [0065], para. [0079]. The application pre-loader module 115 is configured to process the request for an interactive application, load, cache, and execute appropriate virtual scenes of the interactive application and provide relevant content of the interactive application to the client device for rendering. As part of processing, the interaction predictor module 119 may cumulate the interactivity of the user toward all visual options and use the cumulated interactivity to predict the user's imminent selection of a specific one of the visual options in the virtual scene and pre-load the corresponding virtual scene for user interaction. A virtual scene selector sub-module 117 of the pre-loader module 115 selects a virtual scene of the application and presents relevant content of the virtual scene for user interaction).
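For illustration only (this sketch is the examiner's, not Miyaki's implementation), the weighted-accumulation logic described in the cited passages can be summarized in Python; the indicator names, weight values, and threshold below are hypothetical:

    PRELOAD_THRESHOLD = 2.5  # hypothetical pre-defined threshold value

    def update_cumulative_indicators(history, gaze_w=0.0, move_w=0.0, hand_w=0.0):
        # Accumulate the weighted interactive indicators directed toward one
        # visual option, e.g., (G1 w + M1 w + HE1 w) at each time step.
        history["score"] = history.get("score", 0.0) + gaze_w + move_w + hand_w
        return history["score"]

    def maybe_preload(history, preload_scene):
        # Once the cumulative action weights reach the threshold, the second
        # virtual scene is loaded, cached, and kept ready for execution.
        if history.get("score", 0.0) >= PRELOAD_THRESHOLD:
            preload_scene()
            return True
        return False

    # Example mirroring Fig. 4: gaze, movement, and then hand extension remain
    # directed toward visual option 305 across times t0, t1, and t2.
    history = {}
    update_cumulative_indicators(history, gaze_w=0.3, move_w=0.3)              # t0: G1 w1 + M1 w1
    update_cumulative_indicators(history, gaze_w=0.5, move_w=0.5, hand_w=0.3)  # t1: adds HE1 w1
    update_cumulative_indicators(history, gaze_w=0.7, move_w=0.7, hand_w=0.5)  # t2
    maybe_preload(history, lambda: print("pre-loading scene for option 305"))  # threshold reached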
Miyaki does not explicitly disclose that the graphics data is for rendering the particular item with an enhanced image quality; and providing, for output to the user, the particular item within the scene, rendered with the enhanced image quality.
However, Duanmu teaches the graphics data is for rendering the particular item with an enhanced image quality and, after the graphics data is pre-fetched, providing, for output to the user, the particular item within the scene, rendered with the enhanced image quality (see Fig. 3, Fig. 5, para. [0050]-[0052], para. [0080]-[0096], and para. [0104]-[0108]. Immersive video content can be transmitted from a content source (e.g., a video content server) to a display device (e.g., a wearable display device) according to a view-adaptive prefetching technique. A first set of data can represent the video content according to a low level of detail, such that the data can be transmitted and/or presented using a smaller amount of computational and network resources. Further, additional sets of data can represent the same video content according to progressively higher levels of detail, such that the video content can be presented with a higher quality level (e.g., with the trade-off that a larger amount of computation and network resources may be expended). Additional sets of data 304a-304n can represent the same portion of the video content according to progressively higher levels of detail, such that the video content can be presented with a higher quality level. As an example, each of the additional sets of data 304a can include data presenting the same portion of the video content as that of the first set of data 302 (e.g., the portion of the video content that is intended to be presented to a user at a display time T). As shown in FIG. 3, at a time t+m subsequent to the time t and prior to the display time T, a portion of the additional set of data 304a can be streamed from the video content source 104 to the wearable display device 106 and stored in the data buffer 122. The portion of the additional set of data 304a that is streamed and stored in the data buffer 122 can be selected by predicting the viewport 306 that will be used to present the video content to the user at the display time T. For example, if the predicted viewport 306 at the display time T indicates that a particular region of the video content is expected to be in the user's field of view at the display time T, the portion of the additional set of data 304a corresponding to that region can be selectively streamed and stored in the data buffer 122. The portion of the additional set of data 304n that is streamed and stored in the data buffer 122 also can be selected based on the predicted viewport 306 at the display time T. For example, if the predicted viewport 306 at the display time T indicates that a particular region of the video content is expected to be in the user's field of view at the display time T, the portion of the additional set of data 304n corresponding to that region can be selectively streamed and stored in the data buffer 122. As shown in FIG. 5, at one or more times subsequent to the time t and prior to the display time T (e.g., at a time t+m), portions of one or more of the point clouds 400b-400d can be streamed from the video content source 104 to the wearable display device 106 and stored in the data buffer 122. In some implementations, each of the portions of the point clouds 400b-400d that are streamed and stored in the data buffer 122 can include sufficient information to display particular portions of the object at the display time T, according to levels of detail greater than the default level of detail.
The composite point cloud 504 can include portions that enable certain other regions of the object to be presented according to higher levels of detail (e.g., the regions corresponding to the portions of the point clouds 400b-400d stored in the data buffer 122). If the actual viewport at the display time T coincides with the predicted viewport 502 (e.g., the viewing perspective of the user at the display time T was accurately predicted), the region of the object within the actual viewport can be presented to the user according to a level of detail that is higher than the default level of detail).
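For illustration only (this sketch is the examiner's, not Duanmu's implementation), the view-adaptive prefetching behavior described in the cited passages can be summarized in Python; the region identifiers, level-of-detail values, and function names are hypothetical:

    def fetch(region, lod):
        # Stand-in for streaming one region of the content at a level of detail.
        return "region=%s, lod=%d" % (region, lod)

    def prefetch(all_regions, predicted_regions, data_buffer):
        # A low-detail base representation is buffered for the whole scene, and
        # higher-detail data is prefetched only for the predicted viewport.
        data_buffer["base"] = {r: fetch(r, lod=0) for r in all_regions}
        data_buffer["high"] = {r: fetch(r, lod=2) for r in predicted_regions}

    def render(actual_regions, data_buffer):
        # Higher detail where the viewport prediction held; low-detail fallback
        # elsewhere, so presentation remains uninterrupted when the prediction misses.
        return {r: data_buffer["high"].get(r, data_buffer["base"][r])
                for r in actual_regions}

    buf = {}
    prefetch(all_regions={"A", "B", "C"}, predicted_regions={"B"}, data_buffer=buf)
    print(render(actual_regions={"B", "C"}, data_buffer=buf))
    # prints {'B': 'region=B, lod=2', 'C': 'region=C, lod=0'} (key order may vary)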
Miyaki and Duanmu are related to display devices; thus, one of ordinary skill in the art, before the effective filing date of the claimed invention, would have recognized the obviousness of modifying the method disclosed by Miyaki with Duanmu's teachings, since doing so would have been beneficial by enabling the display device to present video content corresponding to the predicted viewing perspective according to a higher level of detail in some situations (e.g., if the viewing perspective of the user at the display time coincides with the predicted viewing perspective), while also enabling the display device to present video content corresponding to any other viewing perspective according to a lower level of detail in other situations (e.g., if the viewing perspective of the user at the display device does not coincide with the predicted viewing perspective, and/or the performance of the network is degraded). Accordingly, the presentation of video content remains uninterrupted, even if the user's behavior and/or inputs are different than expected (see Duanmu para. [0051] and para. [0091]).
Regarding Claim 3, Miyaki and Duanmu teach the method of claim 2.
Miyaki further teaches wherein the gesture is other than a gaze of the user (see Fig. 3A, Fig. 4, para. [0049], para. [0052], para. [0058]-[0062]. Each action identifies an interactive indicator. For example, as the user interacts with the content of the virtual scene, the user's action of moving his head in a particular direction may be used to identify a gaze indicator. Similarly, the user action of moving toward a particular object (e.g., visual option) within the virtual scene may be used to identify a movement indicator, etc. If the user moves within the virtual scene, the movement indicator in a particular direction may be accorded different weight based on the user continuing to move in the particular direction and based on proximity to a visual option. The closer the user gets to the visual option, the greater the weight accorded to the movement indicator. M1 is the movement interactivity in direction 1 and HE1 represents the hand extension activity).
Regarding Claim 4, Miyaki and Duanmu teach the method of claim 2.
Miyaki further teaches wherein the gesture comprises a head movement of the user (see Fig. 3A, 3C, Fig. 4, para. [0049], para. [0052], para. [0058]. As the user interacts with the virtual scene (e.g., moves his head in the direction of a door to a guest room or moves toward a specific guest room door or extends his hand toward a visual option), the user interactivities are evaluated to determine if the actions are indicative of imminent selection of a visual option within scene C′. Each action identifies an interactive indicator. For example, as the user interacts with the content of the virtual scene, the user's action of moving his head in a particular direction may be used to identify a gaze indicator).
Regarding Claim 5, Miyaki and Duanmu teach the method of claim 2.
Miyaki further teaches wherein the gesture comprises a hand movement of the user (see Fig. 4, para. [0049], para. [0052], para. [0060]-[0062]. The predictive interactions provided by the user in the virtual scene may include gaze direction, movement of an avatar or user representation of a user within the virtual scene, direction of motion, extension of a hand of the user into the virtual scene that can be determined by tracking a wearable device, a trigger of a controller, etc., or any other cues (i.e., signals) that are indicative of the user's interest toward the different visual options, wherein these interactions are actions provided by the user without actual selection of the visual options. As the user interacts with the virtual scene (e.g., moves his head in the direction of a door to a guest room or moves toward a specific guest room door or extends his hand toward a visual option), the user interactivities are evaluated to determine if the actions are indicative of imminent selection of a visual option within scene C′. As depicted in Fig. 4, the user has extended his hand in direction 1, which points toward the visual option 305, wherein HE1 represents the hand extension activity in direction 1 and w2 is the weight accorded to the various activities at time t1).
Regarding Claim 6, Miyaki and Duanmu teach the method of claim 2.
Miyaki further teaches wherein the gesture comprises a body movement of the user (see Fig. 4, para. [0030], para. [0049], para. [0058]-[0061], Claim 11. The predictive interactions provided by the user in the virtual scene may include gaze direction, movement of an avatar or user representation of a user within the virtual scene, direction of motion, extension of a hand of the user into the virtual scene that can be determined by tracking a wearable device, a trigger of a controller, etc., or any other cues (i.e., signals) that are indicative of the user's interest toward the different visual options, wherein these interactions are actions provided by the user without actual selection of the visual options. The user action of moving toward a particular object (e.g., visual option) within the virtual scene may be used to identify a movement indicator, etc. If the user moves within the virtual scene, the movement indicator in a particular direction may be accorded different weight based on the user continuing to move in the particular direction and based on proximity to a visual option. The closer the user gets to the visual option, the greater the weight accorded to the movement indicator. As depicted in Fig. 4, M1 is the movement interactivity in direction 1).
Regarding Claim 7, Miyaki and Duanmu teach the method of claim 2.
Miyaki further teaches wherein the gesture comprises a body language signal of the user (see para. [0030], para. [0049], para. [0058]-[0061], Claim 11. The predictive interactions provided by the user in the virtual scene may include gaze direction, movement of an avatar or user representation of a user within the virtual scene, direction of motion, extension of a hand of the user into the virtual scene that can be determined by tracking a wearable device, a trigger of a controller, etc., or any other cues (i.e., signals) that are indicative of the user's interest toward the different visual options, wherein these interactions are actions provided by the user without actual selection of the visual options. The user action of moving toward a particular object (e.g., visual option) within the virtual scene may be used to identify a movement indicator, etc. If the user moves within the virtual scene, the movement indicator in a particular direction may be accorded different weight based on the user continuing to move in the particular direction and based on proximity to a visual option. The closer the user gets to the visual option, the greater the weight accorded to the movement indicator. As depicted in Fig. 4, M1 is the movement interactivity in direction 1).
Regarding Claim 8, Miyaki and Duanmu teach the method of claim 2.
Miyaki further teaches wherein the graphics data associated with the particular item within the scene is pre-fetched before the user interacts with the particular item within the scene (see Fig. 3A, Fig. 4, para. [0007]-[0008], para. [0030]-[0031], para. [0049]-[0062], para. [0065]. Based on the prediction of a visual option selection, content of a different virtual scene associated with the visual option is pre-loaded and cached into memory in advance of the user selection of the visual option to access the different virtual scene so that the content can be readily rendered and made available for user interaction).
Regarding Claim 9, Miyaki teaches a non-transitory computer-readable medium that stores instructions (see para. [0159]-[0161]. Computer readable code on a computer readable medium. Examples of the computer readable medium include hard drives, network attached storage (NAS), read-only memory, random-access memory, CD-ROMs, CD-Rs, CD-RWs, magnetic tapes, and other optical and non-optical data storage devices) which, when executed by a computer processor, cause the computer processor to perform operations (see para. [0158]-[0161]. The apparatus can be specially constructed for the required purpose, or the apparatus can be a general-purpose computer selectively activated or configured by a computer program stored in the computer. In particular, various general-purpose machines can be used with computer programs written in accordance with the teachings herein, or it may be more convenient to construct a more specialized apparatus to perform the required operations. The above described invention may be practiced with other computer system configurations including hand-held devices, microprocessor systems, microprocessor-based or programmable consumer electronics, minicomputers, mainframe computers and the like) comprising:
determining a gesture of a user that is interacting with a scene of a computer game (see Fig. 3A, Fig. 4, para. [0025]-[0026], para. [0049], para. [0052], para. [0058]. A user wearing a head mounted display (HMD) accesses a local server or a remote server on a cloud system, for example, through a user account and selects an interactive application, such as a video game. In response to the request, the server identifies a virtual scene of the interactive application and provides content of the virtual scene for rendering on a display screen of the HMD. The predictive interactions provided by the user in the virtual scene include movement of an avatar or user representation of a user within the virtual scene, direction of motion, extension of a hand of the user into the virtual scene that can be determined by tracking a wearable device, a trigger of a controller, etc. The user may or may not be wearing a wearable device that can be tracked. The interaction predictor 119 may track the user's hand using sensors built in the HMD or using sensors that are external to the HMD or by tracking a wearable device worn on the hand of the user);
identifying a particular item within the scene of the computer game based on the gesture of the user (see Fig. 3A, Fig. 4, para. [0025]-[0026], para. [0033], para. [0047]-[0062]. A user wearing a head mounted display (HMD) accesses a local server or a remote server on a cloud system, for example, through a user account and selects an interactive application, such as a video game. In response to the request, the server identifies a virtual scene of the interactive application and provides content of the virtual scene for rendering on a display screen of the HMD. For example, FIG. 3A illustrates a virtual scene, scene A 303, of the interactive application. The predictive interactions provided by the user in the virtual scene may include gaze direction, movement of an avatar or user representation of a user within the virtual scene, direction of motion, extension of a hand of the user into the virtual scene that can be determined by tracking a wearable device, a trigger of a controller, etc., or any other cues (i.e., signals) that are indicative of the user's interest toward the different visual options, wherein these interactions are actions provided by the user without actual selection of the visual options. The visual options may be in the form of buttons, interactive images of virtual objects, such as a virtual image of a door, a virtual image of an elevator, a virtual image of a set of stairs, etc., or an interactive image of a scene, or a floating menu of options for accessing the different virtual scenes. As the user interacts with the content of the virtual scene, the user's action of moving his head in a particular direction may be used to identify a gaze indicator. Similarly, the user action of moving toward a particular object (e.g., visual option) within the virtual scene may be used to identify a movement indicator, etc. Each of these indicators identified based on the user's interaction in the virtual scene may be accorded different weights. The closer the user gets to the visual option, the greater the weight accorded to the movement indicator. FIG. 4 illustrates one example, wherein the interaction predictor module 119 monitors the user's interactivity within the virtual scene of FIG. 3A over time and evaluates the interactive indicators identified from the user's interactivity to predict the user's imminent selection of a visual option for pre-loading a corresponding virtual scene for user interaction. As shown in FIG. 4, the interaction predictor module 119 detects that at time t0 the user's gaze direction is directed toward visual option 305 (image of the elevator) in the virtual scene A 303, for example. Further, the interaction predictor module 119 may detect the user moving toward the visual option 305. This information is provided to the threshold computator engine 121, which generates a cumulative representation of the interactive indicators. As a result, the cumulative interactive indicators generated by the threshold computator engine 121 for time t0 are represented as (G1 w1+M1 w1), wherein G1 represents gaze interactivity in direction 1 (i.e., toward visual option 305), w1 is the weight accorded to the gaze interactivity at time t0, and M1 is the movement interactivity in direction 1. At time t1, the interaction predictor module 119 detects that the user's interactivity, which includes gaze activity and movement activity, continues to be directed toward visual option 305.
Further, at time t1, the user has extended his hand in direction 1, which points toward the visual option 305. As a result, at time t1, the cumulative interactive indicators generated by the threshold computator engine 121 are represented as (G1 w2+M1 w2+HE1 w1), wherein HE1 represents the hand extension activity in direction 1 and w2 is the weight accorded to the various activities at time t1. The interaction predictor module 119 continues to evaluate the interactive indicators during the user's interaction with the virtual scene. Based on the evaluation, at time t2, the cumulative interactive indicators generated by the threshold computator engine 121 may be represented as (G1 w3+M1 w3+HE1 w2), wherein w3 is the weight accorded to the gaze and movement indicators as each of these indicators remains in direction 1 at time t2. Based on the evaluation, it may be determined that the cumulative action weights of the various interactive indicators have reached the pre-defined threshold value at time t2, indicating that the user's imminent selection is visual option 305. In response, the threshold computator engine 121 sends a signal to the interaction predictor 119 to indicate that visual option 305 was a target of imminent selection by the user, as shown in FIG. 4. Accordingly, a second virtual scene associated with the visual option 305 is selected, loaded, cached and kept ready to enable the pre-loader module to execute the code of the second virtual scene when the visual option 305 is selected by the user, to enable full rendering of the second virtual scene for user interaction); and
pre-fetching graphics data associated with the particular item within the scene that was identified based on the gesture of the user (see Fig. 3A, Fig. 4, para. [0007]-[0008], para. [0030]-[0031], para. [0049]-[0050], para. [0054]-[0057], para. [0062], para. [0065]. Based on the prediction of a visual option selection, content of a different virtual scene associated with the visual option is pre-loaded and cached into memory in advance of the user selection of the visual option to access the different virtual scene so that the content can be readily rendered and made available for user interaction); and
after the graphics data is pre-fetched, providing, for output to the user (see Figs. 3A-4, para. [0041], para. [0046], para. [0049]-[0050], para. [0054]-[0057], para. [0062], para. [0065], para. [0079]. The application pre-loader module 115 is configured to process the request for an interactive application, load, cache, and execute appropriate virtual scenes of the interactive application and provide relevant content of the interactive application to the client device for rendering. As part of processing, the interaction predictor module 119 may cumulate the interactivity of the user toward all visual options and use the cumulated interactivity to predict the user's imminent selection of a specific one of the visual options in the virtual scene and pre-load the corresponding virtual scene for user interaction. A virtual scene selector sub-module 117 of the pre-loader module 115 selects a virtual scene of the application and presents relevant content of the virtual scene for user interaction).
Miyaki does not explicitly disclose that the graphics data is for rendering the particular item with an enhanced image quality; and providing, for output to the user, the particular item within the scene, rendered with the enhanced image quality.
However, Duanmu teaches the graphics data is for rendering the particular item with an enhanced image quality and, after the graphics data is pre-fetched, providing, for output to the user, the particular item within the scene, rendered with the enhanced image quality (see Fig. 3, Fig. 5, para. [0050]-[0052], para. [0080]-[0096], and para. [0104]-[0108]. Immersive video content can be transmitted from a content source (e.g., a video content server) to a display device (e.g., a wearable display device) according to a view-adaptive prefetching technique. A first set of data can represent the video content according to a low level of detail, such that the data can be transmitted and/or presented using a smaller amount of computational and network resources. Further, additional sets of data can represent the same video content according to progressively higher levels of detail, such that the video content can be presented with a higher quality level (e.g., with the trade-off that a larger amount of computation and network resources may be expended). Additional sets of data 304a-304n can represent the same portion of the video content according to progressively higher levels of detail, such that the video content can be presented with a higher quality level. As an example, each of the additional sets of data 304a can include data presenting the same portion of the video content as that of the first set of data 302 (e.g., the portion of the video content that is intended to be presented to a user at a display time T). As shown in FIG. 3, at a time t+m subsequent to the time t and prior to the display time T, a portion of the additional set of data 304a can be streamed from the video content source 104 to the wearable display device 106 and stored in the data buffer 122. The portion of the additional set of data 304a that is streamed and stored in the data buffer 122 can be selected by predicting the viewport 306 that will be used to present the video content to the user at the display time T. For example, if the predicted viewport 306 at the display time T indicates that a particular region of the video content is expected to be in the user's field of view at the display time T, the portion of the additional set of data 304a corresponding to that region can be selectively streamed and stored in the data buffer 122. The portion of the additional set of data 304n that is streamed and stored in the data buffer 122 also can be selected based on the predicted viewport 306 at the display time T. For example, if the predicted viewport 306 at the display time T indicates that a particular region of the video content is expected to be in the user's field of view at the display time T, the portion of the additional set of data 304n corresponding to that region can be selectively streamed and stored in the data buffer 122. As shown in FIG. 5, at one or more times subsequent to the time t and prior to the display time T (e.g., at a time t+m), portions of one or more of the point clouds 400b-400d can be streamed from the video content source 104 to the wearable display device 106 and stored in the data buffer 122. In some implementations, each of the portions of the point clouds 400b-400d that are streamed and stored in the data buffer 122 can include sufficient information to display particular portions of the object at the display time T, according to levels of detail greater than the default level of detail.
The composite point cloud 504 can include portions that enable certain other regions of the object to be presented according to higher levels of detail (e.g., the regions corresponding to the portions of the point clouds 400b-400d stored in the data buffer 122). If the actual viewport at the display time T coincides with the predicted viewport 502 (e.g., the viewing perspective of the user at the display time T was accurately predicted), the region of the object within the actual viewport can be presented to the user according to a level of detail that is higher than the default level of detail).
Miyaki and Duanmu are related to display devices; thus, one of ordinary skill in the art, before the effective filing date of the claimed invention, would have recognized the obviousness of modifying the medium and the instructions disclosed by Miyaki with Duanmu's teachings, since doing so would have been beneficial by enabling the display device to present video content corresponding to the predicted viewing perspective according to a higher level of detail in some situations (e.g., if the viewing perspective of the user at the display time coincides with the predicted viewing perspective), while also enabling the display device to present video content corresponding to any other viewing perspective according to a lower level of detail in other situations (e.g., if the viewing perspective of the user at the display device does not coincide with the predicted viewing perspective, and/or the performance of the network is degraded). Accordingly, the presentation of video content remains uninterrupted, even if the user's behavior and/or inputs are different than expected (see Duanmu para. [0051] and para. [0091]).
Regarding Claim 10, Miyaki and Duanmu teach the non-transitory computer-readable medium of claim 9.
Miyaki further teaches wherein the gesture is other than a gaze of the user (see Fig. 3A, Fig. 4, para. [0049], para. [0052], para. [0058]-[0062]. Each action identifies an interactive indicator. For example, as the user interacts with the content of the virtual scene, the user's action of moving his head in a particular direction may be used to identify a gaze indicator. Similarly, the user action of moving toward a particular object (e.g., visual option) within the virtual scene may be used to identify a movement indicator, etc. If the user moves within the virtual scene, the movement indicator in a particular direction may be accorded different weight based on the user continuing to move in the particular direction and based on proximity to a visual option. The closer the user gets to the visual option, the greater the weight accorded to the movement indicator. M1 is the movement interactivity in direction 1 and HE1 represents the hand extension activity).
Regarding Claim 11, Miyaki and Duanmu teach the non-transitory computer-readable medium of claim 9.
Miyaki further teaches wherein the gesture comprises a head movement of the user (see Fig. 3A, 3C, Fig. 4, para. [0049], para. [0052], para. [0058]. As the user interacts with the virtual scene (e.g., moves his head in the direction of a door to a guest room or moves toward a specific guest room door or extends his hand toward a visual option), the user interactivities are evaluated to determine if the actions are indicative of imminent selection of a visual option within scene C′. Each action identifies an interactive indicator. For example, as the user interacts with the content of the virtual scene, the user's action of moving his head in a particular direction may be used to identify a gaze indicator).
Regarding Claim 12, Miyaki and Duanmu teach the non-transitory computer-readable medium of claim 9.
Miyaki further teaches wherein the gesture comprises a hand movement of the user (see Fig. 4, para. [0049], para. [0052], para. [0060]-[0062]. The predictive interactions provided by the user in the virtual scene may include gaze direction, movement of an avatar or user representation of a user within the virtual scene, direction of motion, extension of a hand of the user into the virtual scene that can be determined by tracking a wearable device, a trigger of a controller, etc., or any other cues (i.e., signals) that are indicative of the user's interest toward the different visual options, wherein these interactions are actions provided by the user without actual selection of the visual options. As the user interacts with the virtual scene (e.g., moves his head in the direction of a door to a guest room or moves toward a specific guest room door or extends his hand toward a visual option), the user interactivities are evaluated to determine if the actions are indicative of imminent selection of a visual option within scene C′. As depicted in Fig. 4, the user has extended his hand in direction 1, which points toward the visual option 305, wherein HE1 represents the hand extension activity in direction 1 and w2 is the weight accorded to the various activities at time t1).
Regarding Claim 13, Miyaki and Duanmu teach the non-transitory computer-readable medium of claim 9.
Miyaki further teaches wherein the gesture comprises a body movement of the user (see Fig. 4, para. [0030], para. [0049], para. [0058]-[0061], Claim 11. The predictive interactions provided by the user in the virtual scene may include gaze direction, movement of an avatar or user representation of a user within the virtual scene, direction of motion, extension of a hand of the user into the virtual scene that can be determined by tracking a wearable device, a trigger of a controller, etc., or any other cues (i.e., signals) that are indicative of the user's interest toward the different visual options, wherein these interactions are actions provided by the user without actual selection of the visual options. The user action of moving toward a particular object (e.g., visual option) within the virtual scene may be used to identify a movement indicator, etc. If the user moves within the virtual scene, the movement indicator in a particular direction may be accorded different weight based on the user continuing to move in the particular direction and based on proximity to a visual option. The closer the user gets to the visual option, the greater the weight accorded to the movement indicator. As depicted in Fig. 4, M1 is the movement interactivity in direction 1).
Regarding Claim 14, Miyaki and Duanmu teach the non-transitory computer-readable medium of claim 9.
Miyaki further teaches wherein the gesture comprises a body language signal of the user (see para. [0030], para. [0049], para. [0058]-[0061], Claim 11. The predictive interactions provided by the user in the virtual scene may include gaze direction, movement of an avatar or user representation of a user within the virtual scene, direction of motion, extension of a hand of the user into the virtual scene that can be determined by tracking a wearable device, a trigger of a controller, etc., or any other cues (i.e., signals) that are indicative of the user's interest toward the different visual options, wherein these interactions are actions provided by the user without actual selection of the visual options. The user action of moving toward a particular object (e.g., visual option) within the virtual scene may be used to identify a movement indicator, etc. If the user moves within the virtual scene, the movement indicator in a particular direction may be accorded different weight based on the user continuing to move in the particular direction and based on proximity to a visual option. The closer the user gets to the visual option, the greater the weight accorded to the movement indicator. As depicted in Fig. 4, M1 is the movement interactivity in direction 1).
Regarding Claim 15, Miyaki and Duanmu teach the non-transitory computer-readable medium of claim 9.
Miyaki further teaches wherein the graphics data associated with the particular item within the scene is pre-fetched before the user interacts with the particular item within the scene (see Fig. 3A, Fig. 4, para. [0007]-[0008], para. [0030]-[0031], para. [0049]-[0062], para. [0065]. Based on the prediction of a visual option selection, content of a different virtual scene associated with the visual option is pre-loaded and cached into memory in advance of the user selection of the visual option to access the different virtual scene so that the content can be readily rendered and made available for user interaction).
Regarding Claim 16, Miyaki teaches a system (see abstract, para. [0158]-[0161]. The invention may employ various computer-implemented operations involving data stored in computer systems) comprising:
a computer processor (see para. [0159]-[0161]. The apparatus can be specially constructed for the required purpose, or the apparatus can be a general-purpose computer selectively activated or configured by a computer program stored in the computer. In particular, various general-purpose machines can be used with computer programs written in accordance with the teachings herein, or it may be more convenient to construct a more specialized apparatus to perform the required operations. The above described invention may be practiced with other computer system configurations including hand-held devices, microprocessor systems, microprocessor-based or programmable consumer electronics, minicomputers, mainframe computers and the like); and
a non-transitory computer-readable medium that stores instructions (see para. [0159]-[0161]. Computer readable code on a computer readable medium. Examples of the computer readable medium include hard drives, network attached storage (NAS), read-only memory, random-access memory, CD-ROMs, CD-Rs, CD-RWs, magnetic tapes, and other optical and non-optical data storage devices) which, when executed by the computer processor, cause the computer processor to perform operations (see para. [0158]-[0161]. The apparatus can be specially constructed for the required purpose, or the apparatus can be a general-purpose computer selectively activated or configured by a computer program stored in the computer. In particular, various general-purpose machines can be used with computer programs written in accordance with the teachings herein, or it may be more convenient to construct a more specialized apparatus to perform the required operations. The above described invention may be practiced with other computer system configurations including hand-held devices, microprocessor systems, microprocessor-based or programmable consumer electronics, minicomputers, mainframe computers and the like) comprising:
determining a gesture of a user that is interacting with a scene of a computer game (see Fig. 3A, Fig. 4, para. [0025]-[0026], para. [0049], para. [0052], para. [0058]. A user wearing a head mounted display (HMD) accesses a local server or a remote server on a cloud system, for example, through a user account and selects an interactive application, such as a video game. In response to the request, the server identifies a virtual scene of the interactive application and provides content of the virtual scene for rendering on a display screen of the HMD. The predictive interactions provided by the user in the virtual scene include movement of an avatar or user representation of a user within the virtual scene, direction of motion, extension of a hand of the user into the virtual scene that can be determined by tracking a wearable device, a trigger of a controller, etc. The user may or may not be wearing a wearable device that can be tracked. The interaction predictor 119 may track the user's hand using sensors built in the HMD or using sensors that are external to the HMD or by tracking a wearable device worn on the hand of the user);
identifying a particular item within the scene of the computer game based on the gesture of the user (see Fig. 3A, Fig. 4, para. [0025]-[0026], para. [0033], para. [0047]-[0062]. A user wearing a head mounted display (HMD) accesses a local server or a remote server on a cloud system, for example, through a user account and selects an interactive application, such as a video game. In response to the request, the server identifies a virtual scene of the interactive application and provides content of the virtual scene for rendering on a display screen of the HMD. For example, FIG. 3A illustrates a virtual scene, scene A 303, of the interactive application. The predictive interactions provided by the user in the virtual scene may include gaze direction, movement of an avatar or user representation of a user within the virtual scene, direction of motion, extension of a hand of the user into the virtual scene that can be determined by tracking a wearable device, a trigger of a controller, etc., or any other cues (i.e., signals) that are indicative of the user's interest toward the different visual options, wherein these interactions are actions provided by the user without actual selection of the visual options. The visual options may be in the form of buttons, interactive images of virtual objects, such as a virtual image of a door, a virtual image of an elevator, a virtual image of a set of stairs, etc., or an interactive image of a scene, or a floating menu of options for accessing the different virtual scenes. As the user interacts with the content of the virtual scene, the user's action of moving his head in a particular direction may be used to identify a gaze indicator. Similarly, the user action of moving toward a particular object (e.g., visual option) within the virtual scene may be used to identify a movement indicator, etc. Each of these indicators identified based on the user's interaction in the virtual scene may be accorded different weights. The closer the user gets to the visual option, the greater the weight accorded to the movement indicator. FIG. 4 illustrates one example, wherein the interaction predictor module 119 monitors the user's interactivity within the virtual scene of FIG. 3A over time and evaluates the interactive indicators identified from the user's interactivity to predict the user's imminent selection of a visual option for pre-loading a corresponding virtual scene for user interaction. As shown in FIG. 4, the interaction predictor module 119 detects that at time t0 the user's gaze direction is directed toward visual option 305 (image of the elevator) in the virtual scene A 303, for example. Further, the interaction predictor module 119 may detect the user moving toward the visual option 305. This information is provided to the threshold computator engine 121, which generates a cumulative representation of the interactive indicators. As a result, the cumulative interactive indicators generated by the threshold computator engine 121 for time t0 are represented as (G1 w1+M1 w1), wherein G1 represents gaze interactivity in direction 1 (i.e., toward visual option 305), w1 is the weight accorded to the gaze interactivity at time t0, and M1 is the movement interactivity in direction 1. At time t1, the interaction predictor module 119 detects that the user's interactivity, which includes gaze activity and movement activity, continues to be directed toward visual option 305.
Further, at time t1, the user has extended his hand in direction 1, which points toward the visual option 305. As a result, at time t1, the cumulative interactive indicators generated by the threshold computator engine 121 are represented as (G1 w2+M1 w2+HE1 w1), wherein HE1 represents the hand extension activity in direction 1 and w2 is the weight accorded to the various activities at time t1. The interaction predictor module 119 continues to evaluate the interactive indicators during the user's interaction with the virtual scene. Based on the evaluation, at time t2, the cumulative interactive indicators generated by the threshold computator engine 121 may be represented as (G1 w3+M1 w3+HE1 w2), wherein w3 is the weight accorded to the gaze and movement indicators as each of these indicators remains in direction 1 at time t2. Based on the evaluation, it may be determined that the cumulative action weights of the various interactive indicators have reached the pre-defined threshold value at time t2, indicating that the user's imminent selection is visual option 305. In response, the threshold computator engine 121 sends a signal to the interaction predictor 119 to indicate that visual option 305 was a target of imminent selection by the user, as shown in FIG. 4. Accordingly, a second virtual scene associated with the visual option 305 is selected, loaded, cached and kept ready to enable the pre-loader module to execute the code of the second virtual scene when the visual option 305 is selected by the user, to enable full rendering of the second virtual scene for user interaction); and
pre-fetching graphics data associated with the particular item within the scene that was identified based on the gesture of the user (see Fig. 3A, Fig. 4, para. [0007]-[0008], para. [0030]-[0031], para. [0049]-[0050], para. [0054]-[0057], para. [0062], para. [0065]. Based on the prediction of a visual option selection, content of a different virtual scene associated with the visual option is pre-loaded and cached into memory in advance of the user selection of the visual option to access the different virtual scene so that the content can be readily rendered and made available for user interaction); and
after the graphics data is pre-fetched, providing, for output to the user (see Figs. 3A-4, para. [0041], para. [0046], para. [0049]-[0050], para. [0054]-[0057], para. [0062], para. [0065], para. [0079]. The application pre-loader module 115 is configured to process the request for an interactive application, load, cache, and execute appropriate virtual scenes of the interactive application and provide relevant content of the interactive application to the client device for rendering. As part of processing, the interaction predictor module 119 may cumulate the interactivity of the user toward all visual options and use the cumulated interactivity to predict the user's imminent selection of a specific one of the visual options in the virtual scene and pre-load the corresponding virtual scene for user interaction. A virtual scene selector sub-module 117 of the pre-loader module 115 selects a virtual scene of the application and presents relevant content of the virtual scene for user interaction).
Miyaki does not explicitly disclose that the graphics data is for rendering the particular item with an enhanced image quality; and providing, for output to the user, the particular item within the scene, rendered with the enhanced image quality.
However, Duanmu teaches the graphics data is for rendering the particular item with an enhanced image quality and, after the graphics data is pre-fetched, providing, for output to the user, the particular item within the scene, rendered with the enhanced image quality (see Fig. 3, Fig. 5, para. [0050]-[0052], para. [0080]-[0096], and para. [0104]-[0108]. Immersive video content can be transmitted from a content source (e.g., a video content server) to a display device (e.g., a wearable display device) according to a view-adaptive prefetching technique. A first set of data can represent the video content according to a low level of detail, such that the data can be transmitted and/or presented using a smaller amount of computational and network resources. Further, additional sets of data can represent the same video content according to progressively higher levels of detail, such that the video content can be presented with a higher quality level (e.g., with the trade-off that a larger amount of computation and network resources may be expended). Additional sets of data 304a-304n can represent the same portion of the video content according to progressively higher levels of detail, such that the video content can be presented with a higher quality level. As an example, each of the additional sets of data 304a can include data presenting the same portion of the video content as that of the first set of data 302 (e.g., the portion of the video content that is intended to be presented to a user at a display time T). As shown in FIG. 3, at a time t+m subsequent to the time t and prior to the display time T, a portion of the additional set of data 304a can be streamed from the video content source 104 to the wearable display device 106 and stored in the data buffer 122. The portion of the additional set of data 304a that is streamed and stored in the data buffer 122 can be selected by predicting the viewport 306 that will be used to present the video content to the user at the display time T. For example, if the predicted viewport 306 at the display time T indicates that a particular region of the video content is expected to be in the user's field of view at the display time T, the portion of the additional set of data 304a corresponding to that region can be selectively streamed and stored in the data buffer 122. The portion of the additional set of data 304n that is streamed and stored in the data buffer 122 also can be selected based on the predicted viewport 306 at the display time T. For example, if the predicted viewport 306 at the display time T indicates that a particular region of the video content is expected to be in the user's field of view at the display time T, the portion of the additional set of data 304n corresponding to that region can be selectively streamed and stored in the data buffer 122. As shown in FIG. 5, at one or more times subsequent to the time t and prior to the display time T (e.g., at a time t+m), portions of one or more of the point clouds 400b-400d can be streamed from the video content source 104 to the wearable display device 106 and stored in the data buffer 122. In some implementations, each of the portions of the point clouds 400b-400d that are streamed and stored in the data buffer 122 can include sufficient information to display particular portions of the object at the display time T, according to levels of detail greater than the default level of detail.
The composite point cloud 504 can include portions that enable certain other regions of the object to be presented according to higher levels of detail (e.g., the regions corresponding to the portions of the point clouds 400b-400d stored in the data buffer 122). If the actual viewport at the display time T coincides with the predicted viewport 502 (e.g., the viewing perspective of the user at the display time T was accurately predicted), the region of the object within the actual viewport can be presented to the user according to a level of detail that is higher than the default level of detail).
Miyaki and Duanmu are related to display devices; thus, one of ordinary skill in the art, before the effective filing date of the claimed invention, would have recognized the obviousness of modifying the system disclosed by Miyaki with Duanmu's teachings, since doing so would enable the display device to present video content corresponding to the predicted viewing perspective according to a higher level of detail in some situations (e.g., if the viewing perspective of the user at the display time coincides with the predicted viewing perspective), while also enabling the display device to present video content corresponding to any other viewing perspective according to a lower level of detail in other situations (e.g., if the viewing perspective of the user at the display time does not coincide with the predicted viewing perspective, and/or the performance of the network is degraded). Accordingly, the presentation of video content remains uninterrupted, even if the user's behavior and/or inputs are different than expected (see Duanmu para. [0051] and para. [0091]).
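For illustration only, and not as part of the record, the view-adaptive prefetching technique summarized above can be sketched roughly as follows; all identifiers (ContentSource, DataBuffer, prefetch, present) are hypothetical stand-ins rather than names from Duanmu's disclosure:

```python
# Hedged sketch of view-adaptive prefetching: a low-detail base set is kept
# for every region, and higher levels of detail (LODs) are streamed into a
# buffer only for the predicted viewport, ahead of display time T.

from dataclasses import dataclass, field


class ContentSource:
    """Stand-in for a video content server offering tiles at several LODs."""
    def fetch(self, region: str, lod: int) -> str:
        return f"{region}@lod{lod}"  # placeholder payload


@dataclass
class DataBuffer:
    """Stand-in for the display device's data buffer."""
    tiles: dict = field(default_factory=dict)  # (region, lod) -> payload

    def store(self, region: str, lod: int, payload: str) -> None:
        self.tiles[(region, lod)] = payload


def prefetch(source: ContentSource, buffer: DataBuffer,
             all_regions: list[str], predicted_region: str, max_lod: int) -> None:
    # Base level of detail for every region, so presentation never stalls.
    for region in all_regions:
        buffer.store(region, 0, source.fetch(region, 0))
    # Progressively higher levels of detail only for the predicted viewport.
    for lod in range(1, max_lod + 1):
        buffer.store(predicted_region, lod, source.fetch(predicted_region, lod))


def present(buffer: DataBuffer, actual_region: str, max_lod: int) -> str:
    # Use the highest level of detail buffered for the region actually viewed;
    # a missed prediction falls back to the base level instead of stalling.
    for lod in range(max_lod, -1, -1):
        if (actual_region, lod) in buffer.tiles:
            return buffer.tiles[(actual_region, lod)]
    raise KeyError(actual_region)


source, buffer = ContentSource(), DataBuffer()
prefetch(source, buffer, ["left", "center", "right"], "center", max_lod=2)
assert present(buffer, "center", 2) == "center@lod2"  # accurate prediction
assert present(buffer, "left", 2) == "left@lod0"      # miss: default detail
```

A missed prediction degrades gracefully to the base level of detail rather than interrupting presentation, which is the benefit relied upon in the rationale above.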
Regarding Claim 17, Miyaki and Duanmu teach the system of claim 16, wherein the gesture is other than a gaze of the user.
Miyaki further teaches wherein the gesture is other than a gaze of the user (see Fig. 3A, Fig. 4, para. [0049], para. [0052], para. [0058]-[0062]. Each action identifies an interactive indicator. For example, as the user interacts with the content of the virtual scene, the user's action of moving his head in a particular direction may be used to identify a gaze indicator. Similarly, the user action of moving toward a particular object (e.g., visual option) within the virtual scene may be used to identify a movement indicator, etc. If the user moves within the virtual scene, the movement indicator in a particular direction may be accorded different weight based on the user continuing to move in the particular direction and based on proximity to a visual option. The closer the user gets to the visual option, the greater the weight accorded to the movement indicator. M1 is the movement interactivity in direction 1 and HE1 represents the hand extension activity).
Regarding Claim 18, Miyaki and Duanmu teach the system of claim 16, wherein the gesture comprises a head movement of the user.
Miyaki further teaches wherein the gesture comprises a head movement of the user (see Fig. 3A, 3C, Fig. 4, para. [0049], para. [0052], para. [0058]. As the user interacts with the virtual scene, (e.g., moves his head in the direction of a door to a guest room or moves toward a specific guest room door or extends his hand toward a visual option), the user interactivities are evaluated to determine if the actions are indicative of imminent selection of a visual option within scene C′. Each action identifies an interactive indicator. For example, as the user interacts with the content of the virtual scene, the user's action of moving his head in a particular direction may be used to identify a gaze indicator).
Regarding Claim 19, Miyaki and Duanmu teach the system of claim 16, wherein the gesture comprises a hand movement of the user.
Miyaki further teaches wherein the gesture comprises a hand movement of the user (see Fig. 4, para. [0049], para. [0052], para. [0060]-[0062]. The predictive interactions provided by the user in the virtual scene may include gaze direction, movement of an avatar or user representation of a user within the virtual scene, direction of motion, extension of a hand of the user into the virtual scene that can be determined by tracking a wearable device, a trigger of a controller, etc., or any other cues (i.e., signals) that are indicative of the user's interest toward the different visual options, wherein these interactions are actions provided by the user without actual selection of the visual options. As the user interacts with the virtual scene, (e.g., moves his head in the direction of a door to a guest room or moves toward a specific guest room door or extends his hand toward a visual option), the user interactivities are evaluated to determine if the actions are indicative of imminent selection of a visual option within scene C′. As depicted in Fig. 4, the user has extended his hand in direction 1, which points toward the visual option 305, wherein HE1 represents the hand extension activity in direction 1 and w2 is the weight accorded to the various activities at time t1).
Regarding Claim 20, Miyaki and Duanmu teach the system of claim 16, wherein the gesture comprises a body movement of the user.
Miyaki further teaches wherein the gesture comprises a body movement of the user (see Fig. 4, para. [0030], para. [0049], para. [0058]-[0061], Claim 11. The predictive interactions provided by the user in the virtual scene may include gaze direction, movement of an avatar or user representation of a user within the virtual scene, direction of motion, extension of a hand of the user into the virtual scene that can be determined by tracking a wearable device, a trigger of a controller, etc., or any other cues (i.e., signals) that are indicative of the user's interest toward the different visual options, wherein these interactions are actions provided by the user without actual selection of the visual options. The user action of moving toward a particular object (e.g., visual option) within the virtual scene may be used to identify a movement indicator, etc. If the user moves within the virtual scene, the movement indicator in a particular direction may be accorded different weight based on the user continuing to move in the particular direction and based on proximity to a visual option. The closer the user gets to the visual option, the greater the weight accorded to the movement indicator. As depicted in Fig. 4, M1 is the movement interactivity in direction 1).
Regarding Claim 21, Miyaki and Duanmu teach the system of claim 16, wherein the gesture comprises a body language signal of the user.
Miyaki further teaches wherein the gesture comprises a body language signal of the user (see para. [0030], para. [0049], para. [0058]-[0061], Claim 11. The predictive interactions provided by the user in the virtual scene may include gaze direction, movement of an avatar or user representation of a user within the virtual scene, direction of motion, extension of a hand of the user into the virtual scene that can be determined by tracking a wearable device, a trigger of a controller, etc., or any other cues (i.e., signals) that are indicative of the user's interest toward the different visual options, wherein these interactions are actions provided by the user without actual selection of the visual options. The user action of moving toward a particular object (e.g., visual option) within the virtual scene may be used to identify a movement indicator, etc. If the user moves within the virtual scene, the movement indicator in a particular direction may be accorded different weight based on the user continuing to move in the particular direction and based on proximity to a visual option. The closer the user gets to the visual option, the greater the weight accorded to the movement indicator. As depicted in Fig. 4, M1 is the movement interactivity in direction 1).
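For illustration only, the weighted interaction-indicator scheme described in Miyaki's cited paragraphs (movement interactivity M1, hand extension activity HE1, and weights accorded at each time step) might be sketched as follows; the threshold value and all names are hypothetical:

```python
# Hedged sketch of weighted gesture indicators: each non-gaze gesture
# contributes a weighted score toward a visual option, and crossing a
# threshold is treated as an imminent selection that triggers pre-fetching.

from typing import NamedTuple


class Indicator(NamedTuple):
    kind: str       # e.g., "movement" (M1) or "hand_extension" (HE1)
    direction: int  # direction index within the virtual scene
    weight: float   # weight accorded to the activity at the current time


def predict_selection(indicators: list[Indicator],
                      option_direction: int,
                      threshold: float = 1.0) -> bool:
    """True if weighted activity toward the option suggests imminent selection."""
    score = sum(i.weight for i in indicators if i.direction == option_direction)
    return score >= threshold


# Example: movement M1 and hand extension HE1 both point in direction 1,
# toward a visual option, so pre-fetching of its content would be triggered.
signals = [Indicator("movement", 1, 0.6), Indicator("hand_extension", 1, 0.5)]
assert predict_selection(signals, option_direction=1)
```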
Claim 22 is rejected under 35 U.S.C. 103 as being unpatentable over Miyaki (US 20190250773 A1) in view of Duanmu (US 20220377304 A1), further in view of Gustafsson et al. (US 20170091549 A1, hereinafter referenced as Gustafsson).
Regarding Claim 22, Miyaki and Duanmu teach the method of claim 2.
Miyaki further teaches wherein graphics data associated with the particular item for rendering the particular item includes data related to the particular item (see para. [0030]-[0031]. Content of a different virtual scene associated with the visual option is pre-loaded and cached into memory in advance of the user selection of the visual option to access the different virtual scene, so that the content can be readily rendered and made available for user interaction. The content for any virtual scene includes graphic intensive content, such as artwork, 3D graphics, 3D dynamic characters, artificial intelligence (AI) characters, etc. In some implementations, the data related to the content is provided as input into a graphics processor that uses a random generator technique to generate variations of the content that include variations in controlled movement of different characters or objects, so as to render variations of the content for different views).
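For illustration only, Miyaki's pre-loading and caching of scene content in advance of selection might look roughly like the following sketch, with all identifiers hypothetical:

```python
# Hedged sketch of pre-loading: graphics-intensive content for a visual
# option is cached in memory before the user actually selects it, so the
# content can be readily rendered on selection.

scene_cache: dict[str, str] = {}


def load_content(option_id: str) -> str:
    return f"scene-content-for-{option_id}"  # placeholder for an expensive load


def preload(option_id: str) -> None:
    """Cache content for a visual option predicted to be selected soon."""
    if option_id not in scene_cache:
        scene_cache[option_id] = load_content(option_id)


def on_select(option_id: str) -> str:
    """On actual selection, serve instantly from cache when the prediction hit."""
    return scene_cache.pop(option_id, None) or load_content(option_id)


preload("guest-room-door")            # triggered by the predicted interaction
assert on_select("guest-room-door")   # rendered without a loading stall
```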
Duanmu further teaches that the graphics data associated with the particular item is for rendering the particular item with an enhanced image quality (see Fig. 3, Fig. 5, para. [0050]-[0052], para. [0080]-[0096], and para. [0104]-[0108]. Immersive video content can be transmitted from a content source (e.g., a video content server) to a display device (e.g., a wearable display device) according to a view-adaptive prefetching technique. A first set of data can represent the video content according to a low level of detail, such that the data can be transmitted and/or presented using a smaller amount of computational and network resources. Further, additional sets of data can represent the same video content according to progressively higher levels of detail, such that the video content can be presented with a higher quality level (e.g., with the trade-off that a larger amount of computational and network resources may be expended). Additional sets of data 304a-304n can represent the same portion of the video content according to progressively higher levels of detail, such that the video content can be presented with a higher quality level. As an example, each of the additional sets of data 304a-304n can include data representing the same portion of the video content as that of the first set of data 302 (e.g., the portion of the video content that is intended to be presented to a user at a display time T). As shown in FIG. 3, at a time t+m subsequent to the time t and prior to the display time T, a portion of the additional set of data 304a can be streamed from the video content source 104 to the wearable display device 106 and stored in the data buffer 122. The portion of the additional set of data 304a that is streamed and stored in the data buffer 122 can be selected by predicting the viewport 306 that will be used to present the video content to the user at the display time T. For example, if the predicted viewport 306 at the display time T indicates that a particular region of the video content is expected to be in the user's field of view at the display time T, the portion of the additional set of data 304a corresponding to that region can be selectively streamed and stored in the data buffer 122. The portion of the additional set of data 304n that is streamed and stored in the data buffer 122 also can be selected based on the predicted viewport 306 at the display time T. For example, if the predicted viewport 306 at the display time T indicates that a particular region of the video content is expected to be in the user's field of view at the display time T, the portion of the additional set of data 304n corresponding to that region can be selectively streamed and stored in the data buffer 122. As shown in FIG. 5, at one or more times subsequent to the time t and prior to the display time T (e.g., at a time t+m), portions of one or more of the point clouds 400b-400d can be streamed from the video content source 104 to the wearable display device 106 and stored in the data buffer 122. In some implementations, each of the portions of the point clouds 400b-400d that are streamed and stored in the data buffer 122 can include sufficient information to display particular portions of the object at the display time T, according to levels of detail greater than the default level of detail.
The composite point cloud 504 can include portions that enable certain other regions of the object to be presented according to higher levels of detail (e.g., the regions corresponding to the portions of the point clouds 400b-400d stored in the data buffer 122). If the actual viewport at the display time T coincides with the predicted viewport 502 (e.g., the viewing perspective of the user at the display time T was accurately predicted), the region of the object within the actual viewport can be presented to the user according to a level of detail that is higher than the default level of detail).
Miyaki and Duanmu are related to display devices; thus, one of ordinary skill in the art, before the effective filing date of the claimed invention, would have recognized the obviousness of modifying the system disclosed by Miyaki with Duanmu's teachings, since doing so would enable the display device to present video content corresponding to the predicted viewing perspective according to a higher level of detail in some situations (e.g., if the viewing perspective of the user at the display time coincides with the predicted viewing perspective), while also enabling the display device to present video content corresponding to any other viewing perspective according to a lower level of detail in other situations (e.g., if the viewing perspective of the user at the display time does not coincide with the predicted viewing perspective, and/or the performance of the network is degraded). Accordingly, the presentation of video content remains uninterrupted, even if the user's behavior and/or inputs are different than expected (see Duanmu para. [0051] and para. [0091]).
Miyaki and Duanmu do not explicitly teach that the graphics data includes data related to at least one of coarseness, curvature, geometry, vertices, depth, color, lighting, shading, texturing, or motion.
However, Gustafsson teaches that the graphics data includes data related to at least one of coarseness, curvature, geometry, vertices, depth, color, lighting, shading, texturing, or motion (see para. [0156]-[0175]. Increasing the quality of the image may include increasing the quality of any one or more of the below non-exclusive list of graphical characteristics, in addition to other possible characteristics known in the art: Shading: Variation of the color and brightness of graphical objects dependent on the artificial lighting projected by light sources emulated by graphics processing device 130; Texture-mapping: The mapping of graphical images or “textures” onto graphical objects to provide the objects with a particular look; Bump-mapping: Simulation of small-scale bumps and rough gradients on surfaces of graphical objects; Fogging/participating medium: The dimming of light when passing through non-clear atmosphere or air; Shadows: Emulation of obstruction of light; Soft shadows: Variance in shadowing and darkness caused by partially obscured light sources; Reflection: Representations of mirror-like or high gloss reflective surfaces; Transparency/opacity (optical or graphic): Sharp transmission of light through solid objects; Translucency: Highly scattered transmission of light through solid objects; Refraction: Bending of light associated with transparency; Diffraction: Bending, spreading and interference of light passing by an object or aperture that disrupts the light ray; Indirect illumination: Surfaces illuminated by light reflected off other surfaces, rather than directly from a light source (also known as global illumination); Caustics (a form of indirect illumination): Reflection of light off a shiny object, or focusing of light through a transparent object, to produce bright highlights on another object; Anti-aliasing: The process of blending the edge of a displayed object to reduce the appearance of sharpness or jagged lines; typically an algorithm is used that samples colors around the edge of the displayed object in order to blend the edge to its surroundings).
Miyaki, Duanmu and Gustafsson are related to display devices; thus, one of ordinary skill in the art, before the effective filing date of the claimed invention, would have recognized the obviousness of modifying the method disclosed by Miyaki and Duanmu with Gustafsson's teachings, since the use of available resources of the graphics processing device, and/or other system resources, is maximized to deliver image quality where it matters most on the display device (Gustafsson, para. [0156]).
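For illustration only, Gustafsson's notion of image quality as a bundle of separately adjustable graphical characteristics might be sketched as follows; the particular fields and values are hypothetical paraphrases of the characteristics listed above:

```python
# Hedged sketch: image quality is modeled as a set of graphical
# characteristics that can be raised selectively for an item of interest,
# spending the graphics budget where it matters most on the display.

from dataclasses import dataclass


@dataclass
class QualitySettings:
    shading: int = 1         # color/brightness variation under emulated lights
    texture_detail: int = 1  # resolution of textures mapped onto objects
    bump_mapping: bool = False
    shadows: bool = False
    reflections: bool = False
    anti_aliasing: int = 0   # samples used to blend object edges


def enhance(base: QualitySettings) -> QualitySettings:
    """Return settings with several characteristics raised for a focused item."""
    return QualitySettings(shading=base.shading + 1,
                           texture_detail=base.texture_detail * 2,
                           bump_mapping=True, shadows=True,
                           reflections=True, anti_aliasing=4)


focused = enhance(QualitySettings())
assert focused.anti_aliasing == 4 and focused.shadows
```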
Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP § 2146 et seq. for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
The filing of a terminal disclaimer by itself is not a complete reply to a nonstatutory double patenting (NSDP) rejection. A complete reply requires that the terminal disclaimer be accompanied by a reply requesting reconsideration of the prior Office action. Even where the NSDP rejection is provisional, the reply must be complete. See MPEP § 804, subsection I.B.1. For a reply to a non-final Office action, see 37 CFR 1.111(a). For a reply to a final Office action, see 37 CFR 1.113(c). A request for reconsideration, while not provided for in 37 CFR 1.113(c), may be filed after final for consideration. See MPEP §§ 706.07(e) and 714.13.
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The actual filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/apply/applying-online/eterminal-disclaimer.
Claims 2-22 are rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1, 6 and 8 of U.S. Patent No. 12189841 B2 in view of Miyaki (US 20190250773 A1).
Current Application No. 19/009,369 | U.S. Patent No. 12189841 B2
Claims of the instant application:
Claim 2. A computer-implemented method comprising:
determining a gesture of a user that is interacting with a scene of a computer game;
identifying a particular item within the scene of the computer game based on the gesture of the user;
pre-fetching graphics data associated with the particular item within the scene that was identified based on the gesture of the user for rendering the particular item with an enhanced image quality; and
after the graphics data is pre-fetched, providing, for output to the user, the particular item within the scene, rendered with the enhanced image quality.
Claim 3. The method of claim 2, wherein the gesture is other than a gaze of the user.
Claim 4. The method of claim 2, wherein the gesture comprises a head movement of the user.
Claim 5. The method of claim 2, wherein the gesture comprises a hand movement of the user.
Claim 6. The method of claim 2, wherein the gesture comprises a body movement of the user.
Claim 7. The method of claim 2, wherein the gesture comprises a body language signal of the user.
Claim 8. The method of claim 2, wherein the graphics data associated with the particular item within the scene is pre-fetched before the user interacts with the particular item within the scene.
Claim 9. A non-transitory computer-readable medium that store instructions which, when executed by a computer processor, cause the computer processor to perform operations comprising:
determining a gesture of a user that is interacting with a scene of a computer game;
identifying a particular item within the scene of the computer game based on the gesture of the user; and
pre-fetching graphics data associated with the particular item within the scene that was identified based on the gesture of the user for rendering the particular item with an enhanced image quality; and after the graphics data is pre-fetched, providing, for output to the user, the particular item within the scene, rendered with the enhanced image quality.
Claim 10. The non-transitory computer-readable medium of claim 9, wherein the gesture is other than a gaze of the user.
Claim 11. The non-transitory computer-readable medium of claim 9, wherein the gesture comprises a head movement of the user.
Claim 12. The non-transitory computer-readable medium of claim 9, wherein the gesture comprises a hand movement of the user.
Claim 13. The non-transitory computer-readable medium of claim 9, wherein the gesture comprises a body movement of the user.
Claim 14. The non-transitory computer-readable medium of claim 9, wherein the gesture comprises a body language signal of the user.
Claim 15. The non-transitory computer-readable medium of claim 9, wherein the graphics data associated with the particular item within the scene is pre-fetched before the user interacts with the particular item within the scene.
Claim 16. A system comprising:
a computer processor; and
a non-transitory computer-readable medium that store instructions which, when executed by the computer processor, cause the computer processor to perform operations comprising:
determining a gesture of a user that is interacting with a scene of a computer game;
identifying a particular item within the scene of the computer game based on the gesture of the user; and
pre-fetching graphics data associated with the particular item within the scene that was identified based on the gesture of the user for rendering the particular item with an enhanced image quality; and after the graphics data is pre-fetched, providing, for output to the user, the particular item within the scene, rendered with the enhanced image quality.
Claim 17. The system of claim 16, wherein the gesture is other than a gaze of the user.
Claim 18. The system of claim 16, wherein the gesture comprises a head movement of the user.
Claim 19. The system of claim 16, wherein the gesture comprises a hand movement of the user.
Claim 20. The system of claim 16, wherein the gesture comprises a body movement of the user.
Claim 21. The system of claim 16, wherein the gesture comprises a body language signal of the user.
Claim 22. The method of claim 2, wherein graphics data associated with the particular item for rendering the particular item with an enhanced image quality includes data related to at least one of coarseness, curvature, geometry, vertices, depth, color, lighting, shading, texturing, or motion.
Claims of U.S. Patent No. 12189841 B2 relied upon:
Claim 1. A method for fetching graphics data for rendering a scene presented on a display, comprising:
receiving gaze information for eyes of a user while the user is interacting with the scene;
tracking gestures of the user while the user is interacting with the scene;
identifying a content item in the scene as being a potential focus of interactivity by the user, the content item is a virtual object that is configured for interaction in the scene, wherein said identifying the content item in the scene is assisted by a machine learning model;
processing the gaze information and the gestures of the user to generate a prediction of interaction with the content item by the user, the prediction of interaction uses a behavior model and said machine learning model to identify relationships between the gaze information of the user while the user is interacting in the scene and the gestures of the user while the user is interacting in the scene to predict a likelihood of the user interacting with said content item;
processing a pre-fetching operation to access and load the graphics data associated with said content item into system memory based on said predicted likelihood in anticipation of the user interacting with the content item; and
rendering the content item at an increased image quality in the scene using said pre-fetched graphics data in anticipation of the user interacting with the content item.
Claim 6. The method of claim 1, wherein the gestures of the user is head movement, hand movement, body movement, position of the user relative to the content item, body language signals, or a combination of two or more thereof.
Claim 8. The method of claim 1, wherein the pre-fetched graphics data include information related to the identified content item, said information include coarseness, curvature, geometry, vertices, depth, color, lighting, shading, texturing, or a combination of two or more thereof.
Mapping of the instant claims to the patent claims: instant claims 2, 9 and 16 correspond to patent claim 1; instant claims 3-7, 10-14 and 17-21 correspond to patent claim 6; instant claims 8 and 15 correspond to the identifying and pre-fetching limitations of patent claim 1 ("… identifying a content item in the scene as being a potential focus of interactivity by the user, the content item is a virtual object that is configured for interaction in the scene, … processing a pre-fetching operation to access and load the graphics data associated with said content item into system memory based on said predicted likelihood in anticipation of the user interacting with the content item"); and instant claim 22 corresponds to patent claim 8.
The claims of U.S. Patent No. 12189841 B2 do not explicitly disclose that the scene is a scene of a computer game, or a system comprising: a computer processor; and a non-transitory computer-readable medium that store instructions which, when executed by the computer processor, cause the computer processor to perform operations.
However, Miyaki teaches the scene is a scene of a computer game (see Fig. 3A, Fig. 4, para. [0025]-[0026], para. [0048]-[0059]. A user wearing a head mounted display (HMD) accesses a local server or a remote server on a cloud system, for example, through a user account and selects an interactive application, such as a video game. In response to the request, the server identifies a virtual scene of the interactive application and provides content of the virtual scene for rendering on a display screen of the HMD. FIGS. 3A-3D illustrate some examples of visual options available in virtual scenes of an interactive application for accessing other virtual scenes); a system (see abstract, para. [0158]-[0161]. The invention may employ various computer-implemented operations involving data stored in computer systems) comprising: a computer processor (see para. [0159]-[0161]. The apparatus can be specially constructed for the required purpose, or the apparatus can be a general-purpose computer selectively activated or configured by a computer program stored in the computer. In particular, various general-purpose machines can be used with computer programs written in accordance with the teachings herein, or it may be more convenient to construct a more specialized apparatus to perform the required operations. The above described invention may be practiced with other computer system configurations including hand-held devices, microprocessor systems, microprocessor-based or programmable consumer electronics, minicomputers, mainframe computers and the like); and a non-transitory computer-readable medium that store instructions (see para. [0159]-[0161]. The invention can also be embodied as computer readable code on a computer readable medium. Examples of the computer readable medium include hard drives, network attached storage (NAS), read-only memory, random-access memory, CD-ROMs, CD-Rs, CD-RWs, magnetic tapes, and other optical and non-optical data storage devices) which, when executed by the computer processor, cause the computer processor to perform operations (see para. [0158]-[0161]. The apparatus can be specially constructed for the required purpose, or the apparatus can be a general-purpose computer selectively activated or configured by a computer program stored in the computer. In particular, various general-purpose machines can be used with computer programs written in accordance with the teachings herein, or it may be more convenient to construct a more specialized apparatus to perform the required operations. The above described invention may be practiced with other computer system configurations including hand-held devices, microprocessor systems, microprocessor-based or programmable consumer electronics, minicomputers, mainframe computers and the like) comprising:
determining a gesture of a user that is interacting with a scene of a computer game (see Fig. 3A, Fig. 4, para. [0049], para. [0052], para. [0058]. The predictive interactions provided by the user in the virtual scene include movement of an avatar or user representation of a user within the virtual scene, direction of motion, extension of a hand of the user into the virtual scene that can be determined by tracking a wearable device, a trigger of a controller, etc. The user may or may not be wearing a wearable device that can be tracked. The interaction predictor 119 may track the user's hand using sensors built in the HMD or using sensors that are external to the HMD or by tracking a wearable device worn on the hand of the user).
U.S. Patent No. 12189841 B2 and Miyaki are related to display systems that pre-load content; thus, one of ordinary skill in the art, before the effective filing date of the claimed invention, would have recognized the obviousness of modifying the claims of U.S. Patent No. 12189841 B2 with Miyaki's teachings of providing a non-transitory computer-readable medium that stores instructions that are executed by a computer processor, since it would have provided the necessary structure to perform the claimed functions. In addition, selecting the scene to be a computer game scene would have been obvious to try, as one of a finite number of options known in the art that would have yielded the same predictable result of pre-loading content.
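For illustration only, the pipeline recited in patent claim 1 as charted above (predicting a likelihood of interaction from gaze and gesture signals, then gating pre-fetching and enhanced rendering on that likelihood) might be sketched as follows; the weighting, threshold, and all names are hypothetical:

```python
# Hedged, schematic rendering of the claimed pipeline: gaze and gesture
# signals feed a prediction of interaction, which gates the pre-fetching of
# graphics data and rendering at an increased image quality.

def predict_likelihood(gaze_toward_item: float, gesture_toward_item: float) -> float:
    # Stand-in for the behavior model / machine learning model relating gaze
    # information and gestures to a likelihood of the user interacting.
    return 0.5 * gaze_toward_item + 0.5 * gesture_toward_item


def fetch_and_render(gaze: float, gesture: float, threshold: float = 0.7) -> str:
    likelihood = predict_likelihood(gaze, gesture)
    if likelihood >= threshold:
        # Pre-fetch graphics data into system memory in anticipation of the
        # interaction, then render the item at an increased image quality.
        prefetched = "graphics data in system memory"  # placeholder
        return f"render at increased quality using {prefetched}"
    return "render at default quality"


assert fetch_and_render(0.9, 0.8).startswith("render at increased")
assert fetch_and_render(0.2, 0.1) == "render at default quality"
```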
Allowable Subject Matter
Claim 23 is objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
US 20240028112 A1 – Hashimoto et al. – Method for displaying image content in the vicinity of an observation point in which a user is interested at a high-quality content level, while content outside the vicinity of the observation point is displayed at a low content level.
US 20180292896 A1 – Hicks et al. – A rendering engine renders content on a display; content at the focus plane is rendered at a higher resolution of the display, and content not at the focus plane is rendered at a lower resolution.
US 20180275410 A1 – Yeon et al. – Method for adjusting display resolutions; the resolution of a virtual object is adjusted based on the proximity of the virtual object to a fixation point, wherein the resolution decreases with increasing distance from the fixation point.
US 20180096461 A1 – Okayama et al. – Method for rendering a high image quality area based on a gaze point (see Figs. 5 and 8).
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to IVELISSE MARTINEZ QUILES whose telephone number is (571)270-7618. The examiner can normally be reached Monday thru Friday; 1:00 PM to 5:00 PM EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Temesghen Ghebretinsae can be reached at 571-272-3017. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/IM/Examiner, Art Unit 2626
/TEMESGHEN GHEBRETINSAE/Supervisory Patent Examiner, Art Unit 2626 3/23/26