DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Priority
Receipt is acknowledged of certified copies of papers submitted under 35 U.S.C. 119(a)-(d), which papers have been placed of record in the file.
Response to Preliminary Amendment
The preliminary amendment filed on April 04, 2024 has been entered.
In view of the amendment to the specification, the paragraph titled “CROSS REFERENCE TO RELATED APPLICATIONS” has been added.
In view of the amendment to the claims, the amendments to claims 1-8, 10-17, 19, and 21 have been acknowledged. Claims 9, 18, and 20 have been canceled.
The preliminary amendment filed on April 10, 2024 has been entered.
In view of the amendment to the specification, the amended Abstract has been acknowledged.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 19 and 21 are rejected under 35 U.S.C. 101 because the claimed invention is directed to a judicial exception (i.e., a law of nature, a natural phenomenon, or an abstract idea) without significantly more.
Claim 19 is directed to a non-transitory computer-readable medium, which is one of the statutory categories of invention. The claim recites “a description of the augmented reality scene, the description comprising: a scene graph; and one or more anchors, wherein each anchor of the one or more anchors is associated with one or more nodes of the scene graph”. These limitations describe a structure of a scene and merely employ mathematical relationships/formulas between “a scene graph”, “anchors”, and “nodes” (MPEP 2106.04(a)(2)). Another example is Digitech Image Techs., LLC v. Electronics for Imaging, Inc., 758 F.3d 1344, 111 USPQ2d 1717 (Fed. Cir. 2014). The patentee in Digitech claimed methods of generating first and second data by taking existing information, manipulating the data using mathematical formulas, and organizing this information into a new form. The court explained that such claims were directed to an abstract idea because they described a process of organizing information through mathematical correlations, like Flook's method of calculating using a mathematical formula (758 F.3d at 1350, 111 USPQ2d at 1721). The grouping of “mathematical concepts” in the 2019 PEG is not limited to formulas or equations, and in fact specifically includes “mathematical calculations” as an exemplar of a mathematical concept. 2019 PEG Section I, 84 Fed. Reg. at 52. Thus, the limitations recite a concept that falls into the “mathematical concepts” group of abstract ideas.
Next, the claim recites the additional limitations of “each anchor of the one or more anchors is associated with one or more nodes of the scene graph and comprises: a trigger, wherein the trigger is a description of at least one condition; wherein the at least one condition is a detection of a visual or audio or environment-based marker or property, and wherein the trigger is activated when a condition of the at least one condition is detected in a real environment; and an action, wherein the action comprises a description of process to be performed by an augmented reality engine; and media content items linked to the nodes of the scene graph”. These limitations merely provide descriptions or definitions for each anchor. Therefore, the claim does not include additional elements that provide a meaningful limitation transforming the abstract idea into a patent-eligible application such that the claim amounts to significantly more than the abstract idea itself. Accordingly, the claim is not patent eligible.
Claim 21 depends from claim 19 and recites the additional limitations “wherein the trigger relies on a detection of an object in the real environment and wherein the trigger is associated with a model of the object or with a semantic description of the object”. The recited “trigger” simply describes mathematical calculations performed on the object in the real environment and a mathematical relationship with “a model” or “a semantic description”. Thus, the limitation describes a “mathematical relationship,” which falls into the “mathematical concepts” grouping of abstract ideas. For the same reasons stated previously with respect to claim 19, the limitations do not integrate the recited judicial exception into a practical application. Therefore, the claim does not amount to significantly more than the abstract idea itself. The claim is not patent eligible.
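For illustration only, and not as part of the record, the following hypothetical sketch (the field names and values are assumptions of the editor, not drawn from the application or from any cited reference) shows the kind of informational arrangement recited by claim 19: a scene graph whose nodes are referenced by anchors, where each anchor holds a trigger (a described condition) and an action, with media content items linked to the nodes.

```python
# Hypothetical, illustrative data only -- not the applicant's format and not any
# reference's syntax.  It merely mirrors the informational structure recited in
# claim 19: scene graph, anchors, triggers, actions, and media linked to nodes.
scene_description = {
    "scene_graph": {
        "nodes": [
            {"id": "node_0", "children": ["node_1"], "media": "statue.glb"},
            {"id": "node_1", "children": [], "media": "ambient_audio.mp3"},
        ]
    },
    "anchors": [
        {
            "nodes": ["node_0"],                      # anchor associated with scene-graph nodes
            "trigger": {                              # description of at least one condition
                "condition": "detect_visual_marker",  # visual/audio/environment-based detection
                "marker": "qr_code_42",
            },
            "action": {                               # process to be performed by an AR engine
                "process": "align_and_render_subtree",
            },
        }
    ],
}
```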
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-2, 6-8, 10-11, 15-17, 19 and 21 are rejected under 35 U.S.C. 103 as being unpatentable over Bouazizi et al. (U.S. Patent Application Publication No. 2022/0335694 A1).
Regarding claim 1, Bouazizi discloses a method for rendering an augmented reality scene for a user in a real environment (Paragraph [0006], this disclosure describes techniques for streaming immersive media content, e.g., for extended reality (XR) content, such as augmented reality (AR) ... For example, the user may be able to navigate the virtual scene using controllers and/or real-world movement. In order to allow for proper movement in a real-world presentation environment ...), the method comprising:
obtaining a description of the augmented reality scene (FIGS. 10 and 11; paragraph [0156], client device 300 may receive a bitstream including a scene description (350); paragraph [0134], according to the techniques of this disclosure, an MPEG_scene_anchor extension may be added as a glTF 2.0 extension to the scene node. An AR scene may contain the MPEG_scene_anchor extension to describe the anchoring of the scene to a real-world XR space), the description comprising:
a scene graph (Paragraph [0156], the scene description may include a scene graph ... For example, the scene description may correspond to the example scene graph of FIG. 5, scene description and updates 200 of FIG. 6, scene graph 262 and scene graph updates 264 of FIG. 8, or scene graph 324 of FIG. 10); and
one or more anchors, wherein each anchor of the one or more anchors is associated with one or more nodes of the scene graph (Paragraphs [0092]-[0094], FIG. 5 shows a scene graph ... Each node in the graph holds pointers to its children. The child nodes can, among others, be a group of other nodes, a geometry element, a transformation matrix, accessors to media data buffers, camera information for the rendering ... Spatial transformations are represented as nodes of the graph and represented by a transformation matrix. Typical usage of transform nodes is to describe rotation, translation or scaling of the objects in its child nodes ...; FIG. 1; paragraph [0068], client device 40 may be configured to perform the various techniques of this disclosure alone or in any combination. In general, retrieval unit 52 may be configured to retrieve a bitstream including media data (e.g., scene data), as discussed above, as well as a scene description. The scene description may include anchor point data representing a correspondence between a virtual scene represented by the media data and a real-world presentation environment. Client device 40 may be configured to anchor the virtual scene to the real-world presentation environment using the anchor point data, and also transform the virtual scene as needed, e.g., through rotation, translation, and/or scaling) and comprises:
a trigger, wherein the trigger is a description of at least one condition (Paragraph [0157], the scene description may include data for an anchor point, such as the MPEG_scene_anchor as discussed above with respect to Table 1. The data for the anchor point may include, for example, data indicating whether the reference space type (e.g., a real-world presentation environment) is to be a view, local, stage, or application type of reference space); wherein the at least one condition is a detection of a visual or audio or environment-based marker or property (Paragraph [0158], presentation unit 330 may automatically detect the real-world presentation environment using camera 308. In some examples, client device 300 may receive image and/or video data captured by camera 308 and upload this data via a sceneUnderstandingStream as indicated by the scene description), and wherein the trigger is activated when a condition of the at least one condition is detected in the real environment (Paragraph [0157], the data for the anchor point may further indicate whether the scene data is to be transformed (e.g., rotated, translated, and/or scaled) to match the real-world presentation environment ... In general, the scene description may include data that relates the scene anchor point to a real-world anchor point, such as a particular location on the floor (e.g., a midpoint of the floor)); and
an action, wherein the action comprises a description of a process to be performed by an augmented reality engine (Paragraph [0157], the anchor point may further include various actions that a user may perform, such as movements received via a controller or real-world repositioning of a device worn by the user);
observing the augmented reality scene (Paragraph [0159], presentation unit 330 may anchor the virtual scene to the real-world presentation environment at the determined real-world anchor point (358). Presentation unit 330 may also transform the virtual scene to align with the real-world presentation environment (360), e.g., using rotation, translation, and/or scaling; paragraphs [0130]-[0133], FIG. 9 is a conceptual diagram illustrating an example anchor XR space indicated by a scene description. According to the techniques of this disclosure, a scene description node may contain a reference to an XR space, which indicates that the scene is anchored to that space. The anchor XR space may be a reference space, e.g., local, view, or stage ... XR runtime systems, such as OpenXR, allow querying of the bounding space for an XR space. The scene description may request that the presentation engine aligns the scene extents, i.e. the bounding box of the scene, to the bounding box of the anchor XR space ... In the example of FIG. 9, the anchor XR space is of type “stage,” corresponding to the floor of the viewer's living room);
on condition that the trigger of one anchor of the one or more anchors is activated (Paragraph [0157], presentation unit 330 of client device 300 may determine an anchor point from the scene description (352). According to the techniques of this disclosure, the scene description may include data for an anchor point, such as the MPEG_scene_anchor as discussed above with respect to Table 1. The data for the anchor point may include, for example, data indicating whether the reference space type (e.g., a real-world presentation environment) is to be a view, local, stage, or application type of reference space; paragraph [0133], in the example of FIG. 9, the anchor XR space is of type “stage,” corresponding to the floor of the viewer's living room), applying the action of the one anchor of the one or more anchors to the one or more nodes associated with the one anchor of the one or more anchors (Paragraph [0159], presentation unit 330 may anchor the virtual scene to the real-world presentation environment at the determined real-world anchor point (358). Presentation unit 330 may also transform the virtual scene to align with the real-world presentation environment (360), e.g., using rotation, translation, and/or scaling); and
loading one or more media content items linked to at least one of the one or more nodes (Paragraph [0151], after aligning the virtual scene anchor point with the real-world anchor point and making any necessary transformations, presentation unit 330 may present media data 322 via display 314. For example, media data 322 may include data defining virtual objects, textures, colors, and locations for the virtual objects).
It is noted that Bouazizi does not use the term “trigger” for the techniques described in the disclosure. However, the claim recites “a trigger, wherein the trigger is a description of at least one condition”. Paragraph [0157] of Bouazizi describes “The data for the anchor point may include, for example, data indicating whether the reference space type (e.g., a real-world presentation environment) is to be a view, local, stage, or application type of reference space”, and this data describes at least one type (“a view, local, stage, or application”) of reference space. Thus, the “type” described by Bouazizi can be considered equivalent to a “condition”. Accordingly, under the broadest reasonable interpretation, “the data” described by Bouazizi can be considered equivalent to “a trigger” as recited in the claim. Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention that Bouazizi discloses the invention as specified in the claim.
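For clarity of the mapping applied above, the following hypothetical sketch (illustrative only; the field and function names are assumptions and do not reproduce Bouazizi's actual MPEG_scene_anchor glTF 2.0 extension syntax) shows how anchor-point data of the kind described in paragraph [0157] of Bouazizi, including a reference space “type”, is being read on the claimed “trigger” and “action”.

```python
# Hypothetical sketch of the claim mapping -- names are assumed and are not
# Bouazizi's actual MPEG_scene_anchor extension syntax.
bouazizi_anchor_point = {
    "reference_space_type": "stage",   # view / local / stage / application (para. [0157])
    "align_to_bounding_box": True,     # transform the scene to match the real environment
}

# Reading applied above: the reference-space "type" is treated as the claimed
# "condition", so the anchor-point data as a whole is treated as the "trigger";
# the transform (rotate/translate/scale) applied on detection is the "action".
claimed_trigger = {
    "condition": "reference_space_type",
    "value": bouazizi_anchor_point["reference_space_type"],
}
claimed_action = {"process": "rotate_translate_scale_to_match_environment"}

def apply_anchor(detected_space_type):
    """Apply the action when the described condition is detected in the real environment."""
    if detected_space_type == claimed_trigger["value"]:  # trigger activated
        return claimed_action["process"]                 # action applied to associated nodes
    return None
```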
Regarding claim 2, Bouazizi discloses everything claimed as applied above (see claim 1), and Bouazizi further discloses wherein the trigger of one of the one or more anchors comprises one or more limit conditions (FIGS. 10 and 11; paragraph [0157], the scene description may include data for an anchor point, such as the MPEG_scene_anchor as discussed above with respect to Table 1. The data for the anchor point may include, for example, data indicating whether the reference space type (e.g., a real-world presentation environment) is to be a view, local, stage, or application type of reference space) and wherein the media content items are linked to the at least one of the one or more nodes and loaded only (Paragraph [0151], after aligning the virtual scene anchor point with the real-world anchor point and making any necessary transformations, presentation unit 330 may present media data 322 via display 314. For example, media data 322 may include data defining virtual objects, textures, colors, and locations for the virtual objects) when the one or more limit conditions are observed in the augmented reality scene (Paragraph [0133], in the example of FIG. 9, the anchor XR space is of type “stage,” corresponding to the floor of the viewer's living room).
Regarding claim 6, Bouazizi discloses everything claimed as applied above (see claim 1), and Bouazizi further discloses wherein the at least one condition is a member of a group of conditions comprising detection of:
one or more visual 2D markers, one or more visual 3D markers, one or more visual signatures, one or more visual geometric properties, one or more visual semantic properties, one or more audio markers, one or more audio properties, one or more temperature conditions, one or more movement of real or virtual objects, one or more hygrometry conditions, one or more lighting changes, and one or more wind conditions (FIGS. 10 and 11; paragraph [0148], anchor point detection unit 332 may determine a type of anchor point to be used from the data of scene graph 324, and identify a corresponding real-world anchor point using image data from camera 308. Presentation unit 330 may use the identified anchor point in the real-world presentation environment to align a virtual scene with the real-world presentation environment. For example, the real-world anchor point may be a point on the floor, a surface (e.g., a table), or the like. In some examples, a visual marker on the real-world object may be used, such as a quick response (QR) code, to represent the real-world anchor point).
Regarding claim 7, Bouazizi discloses everything claimed as applied above (see claim 1), and Bouazizi further discloses that the trigger of one of the one or more anchors relies on a detection of an object in the real environment (FIGS. 10 and 11; paragraph [0146], camera 308 represents a camera used to capture images or video data of the real-world presentation environment ... Additionally or alternatively, camera 308 may detect real-world objects, such as the floor ...) and wherein the trigger is associated with a model of the object or with a semantic description of the object (Paragraph [0157], the scene description may include data for an anchor point, such as the MPEG_scene_anchor as discussed above with respect to Table 1. The data for the anchor point may include, for example, data indicating whether the reference space type (e.g., a real-world presentation environment) is to be a view, local, stage, or application type of reference space; paragraphs [0132]-[0133], the scene description may request that the presentation engine aligns the scene extents, i.e. the bounding box of the scene, to the bounding box of the anchor XR space ... If no scaling is applied, the presentation engine aligns the long edge of the scene bounding box to that of the XR space and then centers the scene bounding box to be collocated with the center of the XR space bounding box ... In the example of FIG. 9, the anchor XR space is of type “stage,” corresponding to the floor of the viewer's living room).
Regarding claim 8, Bouazizi discloses everything claimed as applied above (see claim 1), and Bouazizi further discloses wherein the action of one of the one or more anchors is a member of a group of actions comprising:
playing, pausing or stopping a media content item;
modifying the description of the augmented reality scene (FIGS. 10 and 11; paragraph [0151], after aligning the virtual scene anchor point with the real-world anchor point and making any necessary transformations, presentation unit 330 may present media data 322 via display 314 ... Presentation unit 330 may update the presentation according to user movements detected from user interface devices 306, camera 308, and/or sensors 310, and/or based on updated to scene graph 324); and
connecting a remote device or service.
Regarding claim 10, Bouazizi discloses a device for rendering an augmented reality scene for a user in a real environment (Paragraph [0006], this disclosure describes techniques for streaming immersive media content, e.g., for extended reality (XR) content, such as augmented reality (AR) ... For example, the user may be able to navigate the virtual scene using controllers and/or real-world movement. In order to allow for proper movement in a real-world presentation environment ...), the device comprising a memory associated with a processor (Paragraph [0008], a device for presenting media data includes a memory configured to store media data defining one or more virtual objects in a virtual scene; and one or more processors implemented in circuitry and configured to ...) configured for:
obtaining a description of the augmented reality scene (FIGS. 10 and 11; paragraph [0156], client device 300 may receive a bitstream including a scene description (350); paragraph [0134], according to the techniques of this disclosure, an MPEG_scene_anchor extension may be added as a glTF 2.0 extension to the scene node. An AR scene may contain the MPEG_scene_anchor extension to describe the anchoring of the scene to a real-world XR space), the description comprising:
a scene graph (Paragraph [0156], the scene description may include a scene graph ... For example, the scene description may correspond to the example scene graph of FIG. 5, scene description and updates 200 of FIG. 6, scene graph 262 and scene graph updates 264 of FIG. 8, or scene graph 324 of FIG. 10); and
one or more anchors, wherein each anchor of the one or more anchors is associated with one or more nodes of the scene graph (Paragraphs [0092]-[0094], FIG. 5 shows a scene graph ... Each node in the graph holds pointers to its children. The child nodes can, among others, be a group of other nodes, a geometry element, a transformation matrix, accessors to media data buffers, camera information for the rendering ... Spatial transformations are represented as nodes of the graph and represented by a transformation matrix. Typical usage of transform nodes is to describe rotation, translation or scaling of the objects in its child nodes ...; FIG. 1; paragraph [0068], client device 40 may be configured to perform the various techniques of this disclosure alone or in any combination. In general, retrieval unit 52 may be configured to retrieve a bitstream including media data (e.g., scene data), as discussed above, as well as a scene description. The scene description may include anchor point data representing a correspondence between a virtual scene represented by the media data and a real-world presentation environment. Client device 40 may be configured to anchor the virtual scene to the real-world presentation environment using the anchor point data, and also transform the virtual scene as needed, e.g., through rotation, translation, and/or scaling) and comprises:
a trigger, wherein the trigger is a description of at least one condition (Paragraph [0157], the scene description may include data for an anchor point, such as the MPEG_scene_anchor as discussed above with respect to Table 1. The data for the anchor point may include, for example, data indicating whether the reference space type (e.g., a real-world presentation environment) is to be a view, local, stage, or application type of reference space); wherein the at least one condition is a detection of a visual or audio or environment-based marker or property (Paragraph [0158], presentation unit 330 may automatically detect the real-world presentation environment using camera 308. In some examples, client device 300 may receive image and/or video data captured by camera 308 and upload this data via a sceneUnderstandingStream as indicated by the scene description), and wherein the trigger is activated when a condition of the at least one condition is detected in the real environment (Paragraph [0157], the data for the anchor point may further indicate whether the scene data is to be transformed (e.g., rotated, translated, and/or scaled) to match the real-world presentation environment ... In general, the scene description may include data that relates the scene anchor point to a real-world anchor point, such as a particular location on the floor (e.g., a midpoint of the floor)); and
an action, wherein the action comprises a description of a process to be performed by an augmented reality engine (Paragraph [0157], the anchor point may further include various actions that a user may perform, such as movements received via a controller or real-world repositioning of a device worn by the user);
observing the augmented reality scene (Paragraph [0159], presentation unit 330 may anchor the virtual scene to the real-world presentation environment at the determined real-world anchor point (358). Presentation unit 330 may also transform the virtual scene to align with the real-world presentation environment (360), e.g., using rotation, translation, and/or scaling; paragraphs [0130]-[0133], FIG. 9 is a conceptual diagram illustrating an example anchor XR space indicated by a scene description. According to the techniques of this disclosure, a scene description node may contain a reference to an XR space, which indicates that the scene is anchored to that space. The anchor XR space may be a reference space, e.g., local, view, or stage ... XR runtime systems, such as OpenXR, allow querying of the bounding space for an XR space. The scene description may request that the presentation engine aligns the scene extents, i.e. the bounding box of the scene, to the bounding box of the anchor XR space ... In the example of FIG. 9, the anchor XR space is of type “stage,” corresponding to the floor of the viewer's living room); and
on condition that the trigger of one anchor of the one or more anchors is activated (Paragraph [0157], presentation unit 330 of client device 300 may determine an anchor point from the scene description (352). According to the techniques of this disclosure, the scene description may include data for an anchor point, such as the MPEG_scene_anchor as discussed above with respect to Table 1. The data for the anchor point may include, for example, data indicating whether the reference space type (e.g., a real-world presentation environment) is to be a view, local, stage, or application type of reference space; paragraph [0133], in the example of FIG. 9, the anchor XR space is of type “stage,” corresponding to the floor of the viewer's living room), applying the action of the one anchor of the one or more anchors to the one or more nodes associated with the one anchor of the one or more anchors (Paragraph [0159], presentation unit 330 may anchor the virtual scene to the real-world presentation environment at the determined real-world anchor point (358). Presentation unit 330 may also transform the virtual scene to align with the real-world presentation environment (360), e.g., using rotation, translation, and/or scaling); and
loading a part of media content items being linked to the nodes of the scene graph (Paragraph [0151], after aligning the virtual scene anchor point with the real-world anchor point and making any necessary transformations, presentation unit 330 may present media data 322 via display 314. For example, media data 322 may include data defining virtual objects, textures, colors, and locations for the virtual objects).
It is noted that Bouazizi does not use the term “trigger” for the techniques described in the disclosure. However, the claim recites “a trigger, wherein the trigger is a description of at least one condition”. Paragraph [0157] of Bouazizi describes “The data for the anchor point may include, for example, data indicating whether the reference space type (e.g., a real-world presentation environment) is to be a view, local, stage, or application type of reference space”, and this data describes at least one type (“a view, local, stage, or application”) of reference space. Thus, the “type” described by Bouazizi can be considered equivalent to a “condition”. Accordingly, under the broadest reasonable interpretation, “the data” described by Bouazizi can be considered equivalent to “a trigger” as recited in the claim. Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention that Bouazizi discloses the invention as specified in the claim.
Regarding claim 11, Bouazizi discloses everything claimed as applied above (see claim 10), and Bouazizi further discloses wherein the trigger of one of the one or more anchors comprises one or more limit conditions (FIGS. 10 and 11; paragraph [0157], the scene description may include data for an anchor point, such as the MPEG_scene_anchor as discussed above with respect to Table 1. The data for the anchor point may include, for example, data indicating whether the reference space type (e.g., a real-world presentation environment) is to be a view, local, stage, or application type of reference space) and wherein the media content items are linked to the one or more nodes and loaded only (Paragraph [0151], after aligning the virtual scene anchor point with the real-world anchor point and making any necessary transformations, presentation unit 330 may present media data 322 via display 314. For example, media data 322 may include data defining virtual objects, textures, colors, and locations for the virtual objects) when the one or more limit conditions are observed in the augmented reality scene (Paragraph [0133], in the example of FIG. 9, the anchor XR space is of type “stage,” corresponding to the floor of the viewer's living room).
Regarding claim 15, Bouazizi discloses everything claimed as applied above (see claim 10), and Bouazizi further discloses wherein the at least one condition is a member of a group of conditions comprising detection of:
one or more visual 2D markers, one or more visual 3D markers, one or more visual signatures, one or more visual geometric properties, one or more visual semantic properties, one or more audio markers, one or more audio properties, one or more temperature conditions, one or more movement of real or virtual objects, one or more hygrometry conditions, one or more lighting changes, and one or more wind conditions (FIGS. 10 and 11; paragraph [0148], anchor point detection unit 332 may determine a type of anchor point to be used from the data of scene graph 324, and identify a corresponding real-world anchor point using image data from camera 308. Presentation unit 330 may use the identified anchor point in the real-world presentation environment to align a virtual scene with the real-world presentation environment. For example, the real-world anchor point may be a point on the floor, a surface (e.g., a table), or the like. In some examples, a visual marker on the real-world object may be used, such as a quick response (QR) code, to represent the real-world anchor point).
Regarding claim 16, Bouazizi discloses everything claimed as applied above (see claim 10), and Bouazizi further discloses that the trigger of one of the one or more anchors relies on a detection of an object in the real environment (FIGS. 10 and 11; paragraph [0146], camera 308 represents a camera used to capture images or video data of the real-world presentation environment ... Additionally or alternatively, camera 308 may detect real-world objects, such as the floor ...) and wherein the trigger is associated with a model of the object or with a semantic description of the object (Paragraph [0157], the scene description may include data for an anchor point, such as the MPEG_scene_anchor as discussed above with respect to Table 1. The data for the anchor point may include, for example, data indicating whether the reference space type (e.g., a real-world presentation environment) is to be a view, local, stage, or application type of reference space; paragraphs [0132]-[0133], the scene description may request that the presentation engine aligns the scene extents, i.e. the bounding box of the scene, to the bounding box of the anchor XR space ... If no scaling is applied, the presentation engine aligns the long edge of the scene bounding box to that of the XR space and then centers the scene bounding box to be collocated with the center of the XR space bounding box ... In the example of FIG. 9, the anchor XR space is of type “stage,” corresponding to the floor of the viewer's living room).
Regarding claim 17, Bouazizi discloses everything claimed as applied above (see claim 10), and Bouazizi further discloses wherein the at least an action of one of the one or more anchors is a member of a group of actions comprising:
playing, pausing or stopping a media content item;
modifying the description of the augmented reality scene (FIGS. 10 and 11; paragraph [0151], after aligning the virtual scene anchor point with the real-world anchor point and making any necessary transformations, presentation unit 330 may present media data 322 via display 314 ... Presentation unit 330 may update the presentation according to user movements detected from user interface devices 306, camera 308, and/or sensors 310, and/or based on updated to scene graph 324); and
connecting a remote device or service.
Regarding claim 19, Bouazizi discloses a non-transitory computer-readable medium carrying data representative of an augmented reality scene (Paragraph [0006], this disclosure describes techniques for streaming immersive media content, e.g., for extended reality (XR) content, such as augmented reality (AR) ... For example, the user may be able to navigate the virtual scene using controllers and/or real-world movement. In order to allow for proper movement in a real-world presentation environment ...; paragraph [0009], a computer-readable storage medium has stored thereon instructions that, when executed, cause a processor to receive a scene description of a bitstream ...) and comprising:
a description of the augmented reality scene (FIGS. 10 and 11; paragraph [0156], client device 300 may receive a bitstream including a scene description (350); paragraph [0134], according to the techniques of this disclosure, an MPEG_scene_anchor extension may be added as a glTF 2.0 extension to the scene node. An AR scene may contain the MPEG_scene_anchor extension to describe the anchoring of the scene to a real-world XR space), the description comprising:
a scene graph (Paragraph [0156], the scene description may include a scene graph ... For example, the scene description may correspond to the example scene graph of FIG. 5, scene description and updates 200 of FIG. 6, scene graph 262 and scene graph updates 264 of FIG. 8, or scene graph 324 of FIG. 10); and
one or more anchors, wherein each anchor of the one or more anchors is associated with one or more nodes of the scene graph (Paragraphs [0092]-[0094], FIG. 5 shows a scene graph ... Each node in the graph holds pointers to its children. The child nodes can, among others, be a group of other nodes, a geometry element, a transformation matrix, accessors to media data buffers, camera information for the rendering ... Spatial transformations are represented as nodes of the graph and represented by a transformation matrix. Typical usage of transform nodes is to describe rotation, translation or scaling of the objects in its child nodes ...; FIG. 1; paragraph [0068], client device 40 may be configured to perform the various techniques of this disclosure alone or in any combination. In general, retrieval unit 52 may be configured to retrieve a bitstream including media data (e.g., scene data), as discussed above, as well as a scene description. The scene description may include anchor point data representing a correspondence between a virtual scene represented by the media data and a real-world presentation environment. Client device 40 may be configured to anchor the virtual scene to the real-world presentation environment using the anchor point data, and also transform the virtual scene as needed, e.g., through rotation, translation, and/or scaling) and comprises:
a trigger, wherein the trigger is a description of at least one condition (Paragraph [0157], the scene description may include data for an anchor point, such as the MPEG_scene_anchor as discussed above with respect to Table 1. The data for the anchor point may include, for example, data indicating whether the reference space type (e.g., a real-world presentation environment) is to be a view, local, stage, or application type of reference space); wherein the at least one condition is a detection of a visual or audio or environment-based marker or property (Paragraph [0158], presentation unit 330 may automatically detect the real-world presentation environment using camera 308. In some examples, client device 300 may receive image and/or video data captured by camera 308 and upload this data via a sceneUnderstandingStream as indicated by the scene description), and wherein the trigger is activated when a condition of the at least one condition is detected in a real environment (Paragraph [0157], the data for the anchor point may further indicate whether the scene data is to be transformed (e.g., rotated, translated, and/or scaled) to match the real-world presentation environment ... In general, the scene description may include data that relates the scene anchor point to a real-world anchor point, such as a particular location on the floor (e.g., a midpoint of the floor)); and
an action, wherein the action comprises a description of process to be performed by an augmented reality engine (Paragraph [0157], the anchor point may further include various actions that a user may perform, such as movements received via a controller or real-world repositioning of a device worn by the user); and
media content items linked to the nodes of the scene graph (Paragraph [0142], scene data 320 represents one or more memories (storage devices) for storing media data 322; paragraph [0151], for example, media data 322 may include data defining virtual objects, textures, colors, and locations for the virtual objects).
It is noted that Bouazizi does not use the term “trigger” for the techniques described in the disclosure. However, the claim recites “a trigger, wherein the trigger is a description of at least one condition”. Paragraph [0157] of Bouazizi describes “The data for the anchor point may include, for example, data indicating whether the reference space type (e.g., a real-world presentation environment) is to be a view, local, stage, or application type of reference space”, and this data describes at least one type (“a view, local, stage, or application”) of reference space. Thus, the “type” described by Bouazizi can be considered equivalent to a “condition”. Accordingly, under the broadest reasonable interpretation, “the data” described by Bouazizi can be considered equivalent to “a trigger” as recited in the claim. Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention that Bouazizi discloses the invention as specified in the claim.
Regarding claim 21, Bouazizi discloses everything claimed as applied above (see claim 19), and Bouazizi further discloses that the trigger relies on a detection of an object in the real environment (FIGS. 10 and 11; paragraph [0146], camera 308 represents a camera used to capture images or video data of the real-world presentation environment ... Additionally or alternatively, camera 308 may detect real-world objects, such as the floor ...) and wherein the trigger is associated with a model of the object or with a semantic description of the object (Paragraph [0157], the scene description may include data for an anchor point, such as the MPEG_scene_anchor as discussed above with respect to Table 1. The data for the anchor point may include, for example, data indicating whether the reference space type (e.g., a real-world presentation environment) is to be a view, local, stage, or application type of reference space; paragraphs [0132]-[0133], the scene description may request that the presentation engine aligns the scene extents, i.e. the bounding box of the scene, to the bounding box of the anchor XR space ... If no scaling is applied, the presentation engine aligns the long edge of the scene bounding box to that of the XR space and then centers the scene bounding box to be collocated with the center of the XR space bounding box ... In the example of FIG. 9, the anchor XR space is of type “stage,” corresponding to the floor of the viewer's living room).
Claims 3-4 and 12-13 are rejected under 35 U.S.C. 103 as being unpatentable over Bouazizi et al. (U.S. Patent Application Publication No. 2022/0335694 A1) in view of Smet et al. (U.S. Patent No. 10,825,258 B1).
Regarding claim 3, Bouazizi discloses everything claimed as applied above (see claim 1).
However, Bouazizi does not specifically disclose further comprising, when the one or more limit conditions are no longer observed in the augmented reality scene, unloading the media content items linked to the at least one of the one or more nodes.
In addition, Smet discloses (Abstract, a method includes by a computing device, displaying a user interface for designing augmented-reality effects. The method includes receiving user input through the user interface. The method includes displaying a graph generated based on the user input ...) further comprising, when the one or more limit conditions are no longer observed in the augmented reality scene (Col 14, lines 24-67, FIGS. 5A-5D illustrate example scene graphs associated with a variety of augmented-reality effects. FIG. 5A illustrates a scene graph 500a corresponding to an augmented-reality effect wherein a virtual statue object is rendered in an open, palm-up hand in the scene ...; Col 15, lines 3-33, FIG. 5B illustrates a scene graph 500b corresponding to an augmented-reality effect wherein gestures detected in association with face object instances affect the visibility and animation of assets in the effect. The scene graph 500b comprises a collective node 510b labeled “Facetracker” that corresponds to a module for identifying and tracking the first two recognized female faces in the scene. Thus, the palm-up hand is no longer observed in the augmented reality scene after the face object is detected), unloading the media content items linked to the at least one of the one or more nodes (Col 15, lines 3-33, the scene graph 500b comprises a collective node 510b labeled “Facetracker” that corresponds to a module for identifying and tracking the first two recognized female faces in the scene ... Thus, the scene graph 500a of “Handtracker” is unloaded after the scene graph 500b of “Facetracker” is loaded).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify the scene description taught by Bouazizi to incorporate the teachings of Smet, applying the graph-based design of augmented-reality effects taught by Smet to collect the nodes shown in the graph based on object detection and to provide the object type appearing in a scene to the user. Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify Bouazizi according to the relied-upon teachings of Smet to obtain the invention as specified in the claim.
Regarding claim 4, Bouazizi discloses everything claimed as applied above (see claim 1).
However, Bouazizi does not specifically disclose wherein the trigger of one of the one or more anchors comprises a descriptor indicating whether the action of the one or more anchors continues once the trigger is no longer activated.
In addition, Smet discloses (Abstract, a method includes by a computing device, displaying a user interface for designing augmented-reality effects. The method includes receiving user input through the user interface. The method includes displaying a graph generated based on the user input ...) wherein the trigger of one of the one or more anchors (Col 14, lines 24-67, FIGS. 5A-5D illustrate example scene graphs associated with a variety of augmented-reality effects. FIG. 5A illustrates a scene graph 500a corresponding to an augmented-reality effect wherein a virtual statue object is rendered in an open, palm-up hand in the scene ...; Col 15, lines 3-33, FIG. 5B illustrates a scene graph 500b corresponding to an augmented-reality effect wherein gestures detected in association with face object instances affect the visibility and animation of assets in the effect. The scene graph 500b comprises a collective node 510b labeled “Facetracker” that corresponds to a module for identifying and tracking the first two recognized female faces in the scene) comprises a descriptor indicating whether the action of the one or more anchors continues once the trigger is no longer activated (Col 15, lines 3-33, the scene graph 500b comprises a collective node 510b labeled “Facetracker” that corresponds to a module for identifying and tracking the first two recognized female faces in the scene ... Thus, the scene graph 500a of “Handtracker” is no longer activated after the scene graph 500b of “Facetracker” is loaded).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify the scene description taught by Bouazizi to incorporate the teachings of Smet, applying the graph-based design of augmented-reality effects taught by Smet to collect the nodes shown in the graph based on object detection and to provide the object type appearing in a scene to the user. Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify Bouazizi according to the relied-upon teachings of Smet to obtain the invention as specified in the claim.
Regarding claim 12, Bouazizi discloses everything claimed as applied above (see claim 11).
However, Bouazizi does not specifically disclose wherein the processor is further configured for, when the one or more limit conditions are no longer observed in the augmented reality scene, unloading the media content items linked to the at least one of the one or more nodes.
In addition, Smet discloses (Abstract, a method includes by a computing device, displaying a user interface for designing augmented-reality effects. The method includes receiving user input through the user interface. The method includes displaying a graph generated based on the user input ...) wherein the processor is further configured for, when the one or more limit conditions are no longer observed in the augmented reality scene (Col 14, lines 24-67, FIGS. 5A-5D illustrate example scene graphs associated with a variety of augmented-reality effects. FIG. 5A illustrates a scene graph 500a corresponding to an augmented-reality effect wherein a virtual statue object is rendered in an open, palm-up hand in the scene ...; Col 15, lines 3-33, FIG. 5B illustrates a scene graph 500b corresponding to an augmented-reality effect wherein gestures detected in association with face object instances affect the visibility and animation of assets in the effect. The scene graph 500b comprises a collective node 510b labeled “Facetracker” that corresponds to a module for identifying and tracking the first two recognized female faces in the scene. Thus, the palm-up hand is no longer observed in the augmented reality scene after the face object is detected), unloading the media content items linked to the at least one of the one or more nodes (Col 15, lines 3-33, the scene graph 500b comprises a collective node 510b labeled “Facetracker” that corresponds to a module for identifying and tracking the first two recognized female faces in the scene ... Thus, the scene graph 500a of “Handtracker” is unloaded after the scene graph 500b of “Facetracker” is loaded).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify the scene description taught by Bouazizi to incorporate the teachings of Smet, applying the graph-based design of augmented-reality effects taught by Smet to collect the nodes shown in the graph based on object detection and to provide the object type appearing in a scene to the user. Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify Bouazizi according to the relied-upon teachings of Smet to obtain the invention as specified in the claim.
Regarding claim 13, Bouazizi discloses everything claimed as applied above (see claim 10).
However, Bouazizi does not specifically disclose wherein the trigger of one of the one or more anchors comprises a descriptor indicating whether the action of the one or more anchors continues once the trigger is no longer activated.
In addition, Smet discloses (Abstract, a method includes by a computing device, displaying a user interface for designing augmented-reality effects. The method includes receiving user input through the user interface. The method includes displaying a graph generated based on the user input ...) wherein the trigger of one of the one or more anchors (Col 14, lines 24-67, FIGS. 5A-5D illustrate example scene graphs associated with a variety of augmented-reality effects. FIG. 5A illustrates a scene graph 500a corresponding to an augmented-reality effect wherein a virtual statue object is rendered in an open, palm-up hand in the scene ...; Col 15, lines 3-33, FIG. 5B illustrates a scene graph 500b corresponding to an augmented-reality effect wherein gestures detected in association with face object instances affect the visibility and animation of assets in the effect. The scene graph 500b comprises a collective node 510b labeled “Facetracker” that corresponds to a module for identifying and tracking the first two recognized female faces in the scene) comprises a descriptor indicating whether the action of the one or more anchors continues once the trigger is no longer activated (Col 15, lines 3-33, the scene graph 500b comprises a collective node 510b labeled “Facetracker” that corresponds to a module for identifying and tracking the first two recognized female faces in the scene ... Thus, the scene graph 500a of “Handtracker” is no longer activated after the scene graph 500b of “Facetracker” is loaded).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify the scene description taught by Bouazizi to incorporate the teachings of Smet, applying the graph-based design of augmented-reality effects taught by Smet to collect the nodes shown in the graph based on object detection and to provide the object type appearing in a scene to the user. Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify Bouazizi according to the relied-upon teachings of Smet to obtain the invention as specified in the claim.