Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
DETAILED ACTION
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale or otherwise available to the public before the effective filing date of the claimed invention.
(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.
Claims 1-2, 7-8, 14-15, 19-20 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Caswell et al. (US Pub 2021/0142580 A1).
As to claim 1, Caswell discloses a computer-implemented method comprising:
capturing, with a mobile device comprising a camera and a display, a first image of a physical environment along with location data and orientation data corresponding to the first image (¶0138, “the AR system 502 may include sensors 522 configured to capture information about the physical world 506.” ¶0139-0140, “The image sensors may acquire monocular or stereoscopic information that may be processed to represent the physical world in other ways.” ¶0141, “The head pose tracking component may represent a headpose of a user in a coordinate frame with six degrees of freedom including, for example, translation in three perpendicular axes (e.g., forward/backward, up/down, left/right) and rotation about the three perpendicular axes (e.g., pitch, yaw, and roll).” “relate image information to a particular portion of the physical world or to relate the position of the display worn on the user's head to the physical world.” ¶0142, “the head pose tracking component may compute relative position and orientation of an AR device to physical objects based on visual information captured by cameras and inertial information captured by IMUs.”);
transmitting, to a server system via a wireless communication channel, the location and orientation data (Fig. 4, ¶0157, “the wearable device may communicate with remote components. The local data processing module 570 may be operatively coupled by communication links 576, 578, such as via a wired or wireless communication links, to the remote processing module 572 and remote data repository 574, respectively, such that these remote modules 572, 574 are operatively coupled to each other and available as resources to the local data processing module 570.” ¶0158, “processing may be distributed across local and remote processors.” ¶0162, “the remote data repository 574 may include a digital data storage facility, which may be available through the Internet or other networking configuration in a “cloud” resource configuration.” ¶0162, “all data is stored and all or most computations are performed in the remote data repository 574, allowing for a smaller device. A world reconstruction, for example, may be stored in whole or in part in this repository 574.”);
determining, with the server system, a precise location and orientation of the mobile device in the physical environment based on the location and orientation data received from the mobile device and on information accessed from multiple sources of data including one or more of a street map database, visual positioning system (VPS) data, and image anchors (Fig. 6B, ¶0166, “The information captured by the AR system along the movement path of the user may be processed into one or more tracking maps. The user 530 positions the AR display system at positions 534, and the AR display system records ambient information of a passable world (e.g., a digital representation of the real objects in the physical world that can be stored and updated with changes to the real objects in the physical world) relative to the positions 534. That information may be stored as poses in combination with images, features, directional audio inputs, or other desired data. The positions 534 are aggregated to data inputs 536, for example, as part of a tracking map, and processed at least by a passable world module 538, which may be implemented, for example, by processing on a remote processing module 572 of FIG. 4. In some embodiments, the passable world module 538 may include the head pose component 514 and the world reconstruction component 516, such that the processed information may indicate the location of objects in the physical world in combination with other information about physical objects used in rendering virtual content.” ¶0169, “The mesh model 546 of the physical world may be created by the AR display system and appropriate surfaces and metrics for interacting and displaying the AR content 540 can be stored by the passable world module 538 for future retrieval by the user 530 or other users without the need to completely or partially recreate the model. In some embodiments, the data inputs 536 are inputs such as geolocation, user identification, and current activity to indicate to the passable world module 538 which fixed element 542 of one or more fixed elements are available, which AR content 540 has last been placed on the fixed element 542, and whether to display that same content (such AR content being “persistent” content regardless of user viewing a particular passable world model).” ¶0170, “To render an AR scene with a realistic feel, the AR system may update the position of these non-fixed objects with a much higher frequency than is used to update fixed objects. To enable accurate tracking of all of the objects in the physical world, an AR system may draw information from multiple sensors, including one or more image sensors.” ¶0199-0203.);
rendering, with the server system, an image of virtual content based on the precise location of the mobile device determined by the server system (¶0151, “The AR content may be generated based on this information, such as by AR applications 504. An AR application 504 may be a game program, for example, that performs one or more functions based on information about the physical world, such as visual occlusion, physics-based interactions, and environment reasoning. It may perform these functions by querying data in different formats from the reconstruction 518 produced by the world reconstruction component 516. In some embodiments, component 520 may be configured to output updates when a representation in a region of interest of the physical world changes. That region of interest, for example, may be set to approximate a portion of the physical world in the vicinity of the user of the system, such as the portion within the view field of the user, or is projected (predicted/determined) to come within the view field of the user.” ¶0167, “The AR content is “placed” in the physical world by presenting via the user interface both a representation of the physical world and the AR content, with the AR content rendered as if it were interacting with objects in the physical world and the objects in the physical world presented as if the AR content were, when appropriate, obscuring the user's view of those objects.” ¶0180, “Regardless of how content is presented to a user, a model of the physical world may be used so that characteristics of the virtual objects, which can be impacted by physical objects, including the shape, position, motion, and visibility of the virtual object, can be correctly computed. In some embodiments, the model may include the reconstruction of a physical world, for example, the reconstruction 518.”);
transmitting the rendered image back to the mobile device via the wireless communication channel (¶0157, “the wearable device may communicate with remote components. The local data processing module 570 may be operatively coupled by communication links 576, 578, such as via a wired or wireless communication links, to the remote processing module 572 and remote data repository 574, respectively, such that these remote modules 572, 574 are operatively coupled to each other and available as resources to the local data processing module 570.” ¶0586, “Such computers may be interconnected by one or more networks in any suitable form, including as a local area network or a wide area network, such as an enterprise network or the Internet. Such networks may be based on any suitable technology and may operate according to any suitable protocol and may include wireless networks, wired networks or fiber optic networks.”);
compositing, at the mobile device, the rendered image with an image of the physical environment captured by the mobile device (¶0152, “The AR applications 504 may use this information to generate and update the AR contents. The virtual portion of the AR contents may be presented on the display 508 in combination with the see-through reality 510, creating a realistic user experience.”); and
displaying the composited image on the display of the mobile device such that the virtual content appears in the physical environment (¶0131, “the user of the AR technology also perceives that they “see” a robot statue 357 standing upon the physical world concrete platform 358, and a cartoon-like avatar character 352 flying by which seems to be a personification of a bumble bee, even though these elements (e.g., the avatar character 352, and the robot statue 357) do not exist in the physical world.”).
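Examiner's note (illustration only): the client/server split mapped above for claim 1 can be summarized by the following minimal Python sketch. It is not asserted to represent the operation of Caswell or of the claimed invention; all names, structures, and values are hypothetical stand-ins, and the in-process function calls merely stand in for the wireless communication channel.

    from dataclasses import dataclass

    @dataclass
    class Telemetry:
        lat: float    # location data (e.g., GPS)
        lon: float
        roll: float   # orientation data (see claim 2)
        pitch: float
        yaw: float

    def server_render(pose):
        # Stand-in for the server system: "renders" virtual content for the reported pose.
        return {"pose": pose, "pixel": [0, 255, 0]}   # a single green "virtual" pixel

    def composite(camera_image, rendered):
        # Stand-in compositor: virtual content shows through wherever the camera image is None.
        return [rendered["pixel"] if px is None else px for px in camera_image]

    if __name__ == "__main__":
        camera_image = [None, [10, 10, 10], [20, 20, 20]]   # captured first image (toy form)
        pose = Telemetry(37.0, -122.0, 0.0, 1.5, 90.0)      # captured location and orientation
        rendered = server_render(pose)                      # would cross the wireless channel twice
        print(composite(camera_image, rendered))            # composited image for the display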
As to claim 2, claim 1 is incorporated and Caswell discloses the orientation data comprises roll, yaw and pitch data (¶0141, “The head pose tracking component may represent a headpose of a user in a coordinate frame with six degrees of freedom including, for example, translation in three perpendicular axes (e.g., forward/backward, up/down, left/right) and rotation about the three perpendicular axes (e.g., pitch, yaw, and roll).”).
As to claim 7, Caswell discloses a computer-implemented method comprising:
capturing, with a mobile device comprising a camera and a display, a first image of a physical environment and first set of telemetry data corresponding with the first image, wherein the first set of telemetry data comprises GPS location data along with orientation data (¶0138, “the AR system 502 may include sensors 522 configured to capture information about the physical world 506.” ¶0139-0140, “The image sensors may acquire monocular or stereoscopic information that may be processed to represent the physical world in other ways.” ¶0141, “The head pose tracking component may represent a headpose of a user in a coordinate frame with six degrees of freedom including, for example, translation in three perpendicular axes (e.g., forward/backward, up/down, left/right) and rotation about the three perpendicular axes (e.g., pitch, yaw, and roll).” “relate image information to a particular portion of the physical world or to relate the position of the display worn on the user's head to the physical world.” ¶0142, “the head pose tracking component may compute relative position and orientation of an AR device to physical objects based on visual information captured by cameras and inertial information captured by IMUs.” ¶0149, “The metadata, for example, may indicate time of capture of the sensor information used to form the map. Metadata alternatively or additionally may indicate location of the sensors at the time of capture of information used to form the map. Location may be expressed directly, such as with information from a GPS chip”);
transmitting, to a server system via a wireless communication channel, the first set of telemetry data (Fig. 4, ¶0157, “the wearable device may communicate with remote components. The local data processing module 570 may be operatively coupled by communication links 576, 578, such as via a wired or wireless communication links, to the remote processing module 572 and remote data repository 574, respectively, such that these remote modules 572, 574 are operatively coupled to each other and available as resources to the local data processing module 570.” ¶0158, “processing may be distributed across local and remote processors.” ¶0162, “the remote data repository 574 may include a digital data storage facility, which may be available through the Internet or other networking configuration in a “cloud” resource configuration.” ¶0162, “all data is stored and all or most computations are performed in the remote data repository 574, allowing for a smaller device. A world reconstruction, for example, may be stored in whole or in part in this repository 574.”);
determining, with the server system, a precise location and orientation of the mobile device in the physical environment based on the first set of telemetry data received from the mobile device and on information accessed from multiple sources of data including one or more of a street map database, visual positioning system (VPS) data, and image anchors (Fig. 6B, ¶0166, “The information captured by the AR system along the movement path of the user may be processed into one or more tracking maps. The user 530 positions the AR display system at positions 534, and the AR display system records ambient information of a passable world (e.g., a digital representation of the real objects in the physical world that can be stored and updated with changes to the real objects in the physical world) relative to the positions 534. That information may be stored as poses in combination with images, features, directional audio inputs, or other desired data. The positions 534 are aggregated to data inputs 536, for example, as part of a tracking map, and processed at least by a passable world module 538, which may be implemented, for example, by processing on a remote processing module 572 of FIG. 4. In some embodiments, the passable world module 538 may include the head pose component 514 and the world reconstruction component 516, such that the processed information may indicate the location of objects in the physical world in combination with other information about physical objects used in rendering virtual content.” ¶0169, “The mesh model 546 of the physical world may be created by the AR display system and appropriate surfaces and metrics for interacting and displaying the AR content 540 can be stored by the passable world module 538 for future retrieval by the user 530 or other users without the need to completely or partially recreate the model. In some embodiments, the data inputs 536 are inputs such as geolocation, user identification, and current activity to indicate to the passable world module 538 which fixed element 542 of one or more fixed elements are available, which AR content 540 has last been placed on the fixed element 542, and whether to display that same content (such AR content being “persistent” content regardless of user viewing a particular passable world model).” ¶0170, “To render an AR scene with a realistic feel, the AR system may update the position of these non-fixed objects with a much higher frequency than is used to update fixed objects. To enable accurate tracking of all of the objects in the physical world, an AR system may draw information from multiple sensors, including one or more image sensors.” ¶0199-0203.);
rendering an image of virtual content based at least in part on the first set of telemetry data for the mobile device, wherein the rendered image of virtual content is over rendered such that it represents a larger area in the virtual world than the first image represents in the physical environment (¶0151, “The AR content may be generated based on this information, such as by AR applications 504. An AR application 504 may be a game program, for example, that performs one or more functions based on information about the physical world, such as visual occlusion, physics-based interactions, and environment reasoning. It may perform these functions by querying data in different formats from the reconstruction 518 produced by the world reconstruction component 516. In some embodiments, component 520 may be configured to output updates when a representation in a region of interest of the physical world changes. That region of interest, for example, may be set to approximate a portion of the physical world in the vicinity of the user of the system, such as the portion within the view field of the user, or is projected (predicted/determined) to come within the view field of the user.” ¶0167, “The AR content is “placed” in the physical world by presenting via the user interface both a representation of the physical world and the AR content, with the AR content rendered as if it were interacting with objects in the physical world and the objects in the physical world presented as if the AR content were, when appropriate, obscuring the user's view of those objects.” ¶0180, “Regardless of how content is presented to a user, a model of the physical world may be used so that characteristics of the virtual objects, which can be impacted by physical objects, including the shape, position, motion, and visibility of the virtual object, can be correctly computed. In some embodiments, the model may include the reconstruction of a physical world, for example, the reconstruction 518.” ¶0168, “the passable world module 538 may recognize the environment 532 from a previously mapped environment and display AR content without a device of the user 530 mapping all or part of the environment 532 first, saving computation process and cycles and avoiding latency of any rendered AR content.” ¶0209-0210, ¶0214, ¶0244, “An image reprojection may be applied to the virtual content to account for a change in eye position, however, as the rendering is still in the same position, jitter is minimized”);
transmitting the rendered image and the first set of telemetry data back to the mobile device via the wireless communication channel (¶0157, “the wearable device may communicate with remote components. The local data processing module 570 may be operatively coupled by communication links 576, 578, such as via a wired or wireless communication links, to the remote processing module 572 and remote data repository 574, respectively, such that these remote modules 572, 574 are operatively coupled to each other and available as resources to the local data processing module 570.” ¶0586, “Such computers may be interconnected by one or more networks in any suitable form, including as a local area network or a wide area network, such as an enterprise network or the Internet. Such networks may be based on any suitable technology and may operate according to any suitable protocol and may include wireless networks, wired networks or fiber optic networks.”);
capturing, with the mobile device, a second image and a second set of telemetry data, wherein the second set of telemetry data comprises gps location data along with orientation data (¶0143, “the AR device may construct a map from the feature points recognized in successive images in a series of image frames captured as a user moves throughout the physical world with the AR device. Though each image frame may be taken from a different pose as the user moves, the system may adjust the orientation of the features of each successive image frame to match the orientation of the initial image frame by matching features of the successive image frames to previously captured image frames. Translations of the successive image frames so that points representing the same features will match corresponding feature points from previously collected image frames, can be used to align each successive image frame to match the orientation of previously processed image frames. The frames in the resulting map may have a common orientation established when the first image frame was added to the map. This map, with sets of feature points in a common frame of reference, may be used to determine the user's pose within the physical world by matching features from current image frames to the map. In some embodiments, this map may be called a tracking map.” ¶146, “the selected image frames, or groups of features from selected image frames may serve as key frames for the map, which are used to provide spatial information.” ¶0147, “The AR system 502 may integrate sensor data over time from multiple viewpoints of a physical world. The poses of the sensors (e.g., position and orientation) may be tracked as a device including the sensors is moved.” ¶0149, “The metadata, for example, may indicate time of capture of the sensor information used to form the map. Metadata alternatively or additionally may indicate location of the sensors at the time of capture of information used to form the map. Location may be expressed directly, such as with information from a GPS chip” ¶0170, “To render an AR scene with a realistic feel, the AR system may update the position of these non-fixed objects with a much higher frequency than is used to update fixed objects. To enable accurate tracking of all of the objects in the physical world, an AR system may draw information from multiple sensors, including one or more image sensors.”);
compositing, at the mobile device, a subset of the rendered image with the second image, wherein the subset of the rendered image is determined based on changes between the first and second sets of telemetry data (¶0152, “The AR applications 504 may use this information to generate and update the AR contents. The virtual portion of the AR contents may be presented on the display 508 in combination with the see-through reality 510, creating a realistic user experience.” ¶0179, ¶0183, “In addition to generating information for a persisted world representation, the perception module 660 may identify and output indications of changes in a region around a user of an AR system. Indications of such changes may trigger updates to volumetric data stored as part of the persisted world, or trigger other functions, such as triggering components 604 that generate AR content to update the AR content.” ¶0203, “A relative pose may be adequate for a tracking map, as the map may be relative to a coordinate system local to a device established based on the initial pose of the device when construction of the tracking map was initiated.”); and
displaying the composited image on the display of the mobile device such that the virtual content appears in the physical environment (¶0131, “the user of the AR technology also perceives that they “see” a robot statue 357 standing upon the physical world concrete platform 358, and a cartoon-like avatar character 352 flying by which seems to be a personification of a bumble bee, even though these elements (e.g., the avatar character 352, and the robot statue 357) do not exist in the physical world.”).
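Examiner's note (illustration only): claim 7's over-rendering and subset selection can be pictured with the sketch below, in which a window is cropped from a larger rendered frame according to the change between the first and second sets of telemetry data. The conversion of orientation changes into pixel shifts, and all names and sizes, are hypothetical assumptions rather than teachings of Caswell or limitations of the claim.

    import numpy as np

    def crop_overrendered(over_img, view_hw, dyaw_px, dpitch_px):
        """Pick the sub-window of an over-rendered frame that matches the latest telemetry.

        over_img:  over-rendered image, larger than the on-screen field of view
        view_hw:   (height, width) of the window actually displayed
        dyaw_px:   horizontal pixel shift implied by the yaw change between telemetry sets
        dpitch_px: vertical pixel shift implied by the pitch change
        """
        h, w = view_hw
        oh, ow = over_img.shape[:2]
        top = (oh - h) // 2 + dpitch_px
        left = (ow - w) // 2 + dyaw_px
        top = max(0, min(oh - h, top))      # keep the window inside the over-rendered area
        left = max(0, min(ow - w, left))
        return over_img[top:top + h, left:left + w]

    if __name__ == "__main__":
        over_img = np.arange(12 * 16).reshape(12, 16)          # 12x16 over-rendered frame
        subset = crop_overrendered(over_img, (8, 10), dyaw_px=2, dpitch_px=-1)
        print(subset.shape)                                    # (8, 10) subset used for compositing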
As to claim 8, claim 7 is incorporated and Caswell discloses the orientation data comprises roll, yaw and pitch data (See claim 1 for detailed analysis.).
As to claim 14, Caswell discloses a computer-implemented method comprising:
capturing a first image of a physical environment and first set of telemetry data corresponding with the first image, wherein the first set of telemetry data comprises gps location data and orientation data (¶0138, “the AR system 502 may include sensors 522 configured to capture information about the physical world 506.” ¶0139-0140, “The image sensors may acquire monocular or stereoscopic information that may be processed to represent the physical world in other ways.” ¶0141, “The head pose tracking component may represent a headpose of a user in a coordinate frame with six degrees of freedom including, for example, translation in three perpendicular axes (e.g., forward/backward, up/down, left/right) and rotation about the three perpendicular axes (e.g., pitch, yaw, and roll).” “relate image information to a particular portion of the physical world or to relate the position of the display worn on the user's head to the physical world.” ¶0142, “the head pose tracking component may compute relative position and orientation of an AR device to physical objects based on visual information captured by cameras and inertial information captured by IMUs.” ¶0149, “The metadata, for example, may indicate time of capture of the sensor information used to form the map. Metadata alternatively or additionally may indicate location of the sensors at the time of capture of information used to form the map. Location may be expressed directly, such as with information from a GPS chip”);
transmitting the first set of telemetry data to a server system via a wireless communication channel (Fig. 4, ¶0157, “the wearable device may communicate with remote components. The local data processing module 570 may be operatively coupled by communication links 576, 578, such as via a wired or wireless communication links, to the remote processing module 572 and remote data repository 574, respectively, such that these remote modules 572, 574 are operatively coupled to each other and available as resources to the local data processing module 570.” ¶0158, “processing may be distributed across local and remote processors.” ¶0162, “the remote data repository 574 may include a digital data storage facility, which may be available through the Internet or other networking configuration in a “cloud” resource configuration.” ¶0162, “all data is stored and all or most computations are performed in the remote data repository 574, allowing for a smaller device. A world reconstruction, for example, may be stored in whole or in part in this repository 574.”);
receiving a rendered image of virtual content from the server system via the wireless communication channel, wherein the rendered image is generated based at least in part on the first set of telemetry data and wherein the rendered image of virtual content is over rendered such that it represents a larger area in the virtual world than the first image represents in the physical environment (¶0151, “The AR content may be generated based on this information, such as by AR applications 504. An AR application 504 may be a game program, for example, that performs one or more functions based on information about the physical world, such as visual occlusion, physics-based interactions, and environment reasoning. It may perform these functions by querying data in different formats from the reconstruction 518 produced by the world reconstruction component 516. In some embodiments, component 520 may be configured to output updates when a representation in a region of interest of the physical world changes. That region of interest, for example, may be set to approximate a portion of the physical world in the vicinity of the user of the system, such as the portion within the view field of the user, or is projected (predicted/determined) to come within the view field of the user.” ¶0167, “The AR content is “placed” in the physical world by presenting via the user interface both a representation of the physical world and the AR content, with the AR content rendered as if it were interacting with objects in the physical world and the objects in the physical world presented as if the AR content were, when appropriate, obscuring the user's view of those objects.” ¶0180, “Regardless of how content is presented to a user, a model of the physical world may be used so that characteristics of the virtual objects, which can be impacted by physical objects, including the shape, position, motion, and visibility of the virtual object, can be correctly computed. In some embodiments, the model may include the reconstruction of a physical world, for example, the reconstruction 518.” ¶0168, “the passable world module 538 may recognize the environment 532 from a previously mapped environment and display AR content without a device of the user 530 mapping all or part of the environment 532 first, saving computation process and cycles and avoiding latency of any rendered AR content.” ¶0209-0210, ¶0214,);
capturing a second image of a physical environment and second set of telemetry data corresponding with the second image, wherein the second set of telemetry data comprises gps location data and orientation data (¶0143, “the AR device may construct a map from the feature points recognized in successive images in a series of image frames captured as a user moves throughout the physical world with the AR device. Though each image frame may be taken from a different pose as the user moves, the system may adjust the orientation of the features of each successive image frame to match the orientation of the initial image frame by matching features of the successive image frames to previously captured image frames. Translations of the successive image frames so that points representing the same features will match corresponding feature points from previously collected image frames, can be used to align each successive image frame to match the orientation of previously processed image frames. The frames in the resulting map may have a common orientation established when the first image frame was added to the map. This map, with sets of feature points in a common frame of reference, may be used to determine the user's pose within the physical world by matching features from current image frames to the map. In some embodiments, this map may be called a tracking map.” ¶146, “the selected image frames, or groups of features from selected image frames may serve as key frames for the map, which are used to provide spatial information.” ¶0147, “The AR system 502 may integrate sensor data over time from multiple viewpoints of a physical world. The poses of the sensors (e.g., position and orientation) may be tracked as a device including the sensors is moved.” ¶0149, “The metadata, for example, may indicate time of capture of the sensor information used to form the map. Metadata alternatively or additionally may indicate location of the sensors at the time of capture of information used to form the map. Location may be expressed directly, such as with information from a GPS chip” ¶0170, “To render an AR scene with a realistic feel, the AR system may update the position of these non-fixed objects with a much higher frequency than is used to update fixed objects. To enable accurate tracking of all of the objects in the physical world, an AR system may draw information from multiple sensors, including one or more image sensors.”);
compositing, at the mobile device, a subset of the rendered image with the second image, wherein the subset of the rendered image is determined based on changes between the first and second sets of telemetry data (¶0152, “The AR applications 504 may use this information to generate and update the AR contents. The virtual portion of the AR contents may be presented on the display 508 in combination with the see-through reality 510, creating a realistic user experience.” ¶0179, ¶0183, “In addition to generating information for a persisted world representation, the perception module 660 may identify and output indications of changes in a region around a user of an AR system. Indications of such changes may trigger updates to volumetric data stored as part of the persisted world, or trigger other functions, such as triggering components 604 that generate AR content to update the AR content.” ¶0203, “A relative pose may be adequate for a tracking map, as the map may be relative to a coordinate system local to a device established based on the initial pose of the device when construction of the tracking map was initiated.”); and
displaying the composited image on the display of the mobile device such that the virtual content appears in the physical environment (¶0131, “the user of the AR technology also perceives that they “see” a robot statue 357 standing upon the physical world concrete platform 358, and a cartoon-like avatar character 352 flying by which seems to be a personification of a bumble bee, even though these elements (e.g., the avatar character 352, and the robot statue 357) do not exist in the physical world.”).
As to claim 15, claim 14 is incorporated and Caswell discloses the orientation data comprises roll, yaw and pitch data (See claim 1 for detailed analysis.).
As to claim 19, Caswell discloses a computer-implemented method comprising:
receiving, from a mobile device comprising a camera and a display, a first set of telemetry data corresponding with a first image, wherein the first set of telemetry data comprises GPS location data along with orientation data (¶0138, “the AR system 502 may include sensors 522 configured to capture information about the physical world 506.” ¶0139-0140, “The image sensors may acquire monocular or stereoscopic information that may be processed to represent the physical world in other ways.” ¶0141, “The head pose tracking component may represent a headpose of a user in a coordinate frame with six degrees of freedom including, for example, translation in three perpendicular axes (e.g., forward/backward, up/down, left/right) and rotation about the three perpendicular axes (e.g., pitch, yaw, and roll).” “relate image information to a particular portion of the physical world or to relate the position of the display worn on the user's head to the physical world.” ¶0142, “the head pose tracking component may compute relative position and orientation of an AR device to physical objects based on visual information captured by cameras and inertial information captured by IMUs.” ¶0149, “The metadata, for example, may indicate time of capture of the sensor information used to form the map. Metadata alternatively or additionally may indicate location of the sensors at the time of capture of information used to form the map. Location may be expressed directly, such as with information from a GPS chip” Fig. 4, ¶0157, “the wearable device may communicate with remote components. The local data processing module 570 may be operatively coupled by communication links 576, 578, such as via a wired or wireless communication links, to the remote processing module 572 and remote data repository 574, respectively, such that these remote modules 572, 574 are operatively coupled to each other and available as resources to the local data processing module 570.” ¶0158, “processing may be distributed across local and remote processors.” ¶0162, “the remote data repository 574 may include a digital data storage facility, which may be available through the Internet or other networking configuration in a “cloud” resource configuration.” ¶0162, “all data is stored and all or most computations are performed in the remote data repository 574, allowing for a smaller device. A world reconstruction, for example, may be stored in whole or in part in this repository 574.”);
determining a precise location and orientation of the mobile device in the physical environment based on the first set of telemetry data received from the mobile device and on information accessed from multiple sources of data including one or more of a street map database, visual positioning system (VPS) data, and image anchors (Fig. 6B, ¶0166, “The information captured by the AR system along the movement path of the user may be processed into one or more tracking maps. The user 530 positions the AR display system at positions 534, and the AR display system records ambient information of a passable world (e.g., a digital representation of the real objects in the physical world that can be stored and updated with changes to the real objects in the physical world) relative to the positions 534. That information may be stored as poses in combination with images, features, directional audio inputs, or other desired data. The positions 534 are aggregated to data inputs 536, for example, as part of a tracking map, and processed at least by a passable world module 538, which may be implemented, for example, by processing on a remote processing module 572 of FIG. 4. In some embodiments, the passable world module 538 may include the head pose component 514 and the world reconstruction component 516, such that the processed information may indicate the location of objects in the physical world in combination with other information about physical objects used in rendering virtual content.” ¶0169, “The mesh model 546 of the physical world may be created by the AR display system and appropriate surfaces and metrics for interacting and displaying the AR content 540 can be stored by the passable world module 538 for future retrieval by the user 530 or other users without the need to completely or partially recreate the model. In some embodiments, the data inputs 536 are inputs such as geolocation, user identification, and current activity to indicate to the passable world module 538 which fixed element 542 of one or more fixed elements are available, which AR content 540 has last been placed on the fixed element 542, and whether to display that same content (such AR content being “persistent” content regardless of user viewing a particular passable world model).” ¶0170, “To render an AR scene with a realistic feel, the AR system may update the position of these non-fixed objects with a much higher frequency than is used to update fixed objects. To enable accurate tracking of all of the objects in the physical world, an AR system may draw information from multiple sensors, including one or more image sensors.” ¶0199-0203.);
rendering an image of virtual content based at least in part on the first set of telemetry data for the mobile device, wherein the rendered image of virtual content is over rendered such that it represents a larger area in the virtual world than the first image represents in the physical environment (¶0151, “The AR content may be generated based on this information, such as by AR applications 504. An AR application 504 may be a game program, for example, that performs one or more functions based on information about the physical world, such as visual occlusion, physics-based interactions, and environment reasoning. It may perform these functions by querying data in different formats from the reconstruction 518 produced by the world reconstruction component 516. In some embodiments, component 520 may be configured to output updates when a representation in a region of interest of the physical world changes. That region of interest, for example, may be set to approximate a portion of the physical world in the vicinity of the user of the system, such as the portion within the view field of the user, or is projected (predicted/determined) to come within the view field of the user.” ¶0167, “The AR content is “placed” in the physical world by presenting via the user interface both a representation of the physical world and the AR content, with the AR content rendered as if it were interacting with objects in the physical world and the objects in the physical world presented as if the AR content were, when appropriate, obscuring the user's view of those objects.” ¶0180, “Regardless of how content is presented to a user, a model of the physical world may be used so that characteristics of the virtual objects, which can be impacted by physical objects, including the shape, position, motion, and visibility of the virtual object, can be correctly computed. In some embodiments, the model may include the reconstruction of a physical world, for example, the reconstruction 518.” ¶0168, “the passable world module 538 may recognize the environment 532 from a previously mapped environment and display AR content without a device of the user 530 mapping all or part of the environment 532 first, saving computation process and cycles and avoiding latency of any rendered AR content.” ¶0209-0210, ¶0214); and
transmitting the rendered image and the first set of telemetry data back to the mobile device via the wireless communication channel (¶0157, “the wearable device may communicate with remote components. The local data processing module 570 may be operatively coupled by communication links 576, 578, such as via a wired or wireless communication links, to the remote processing module 572 and remote data repository 574, respectively, such that these remote modules 572, 574 are operatively coupled to each other and available as resources to the local data processing module 570.” ¶0586, “Such computers may be interconnected by one or more networks in any suitable form, including as a local area network or a wide area network, such as an enterprise network or the Internet. Such networks may be based on any suitable technology and may operate according to any suitable protocol and may include wireless networks, wired networks or fiber optic networks.”).
As to claim 20, claim 19 is incorporated and Caswell discloses the orientation data comprises roll, yaw and pitch data (See claim 1 for detailed analysis.).
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 3-4, 6, 9, 10, 12-13, 16-17 are rejected under 35 U.S.C. 103 as being unpatentable over Caswell et al. (US Pub 2021/0142580 A1) in view of Rasmussen et al. (US Patent 10,692,288).
As to claim 3, claim 1 is incorporated and Caswell does not disclose the rendered image sent by the server system comprises six channels of information including separate channels for red, green, blue, alpha, depth and shadow data.
Rasmussen teaches the rendered image sent by the server system comprises six channels of information including separate channels for red, green, blue, alpha, depth and shadow data (Rasmussen, Col 2, lines 25-65, “The second image may include a four-channel image with an alpha channel.” “The second image may include a cutout that at least partially obscures the view of the computer-generated object, where the cutout may correspond to an object in the physical environment.” “The mobile device may include a light sensor that captures lighting information about the physical environment from a perspective of the physical camera” “the method/operations may further include transmitting the lighting information to the content provider system, where the virtual environment may be lighted based on the lighting information about the physical environment.” The lighting information corresponds to the claimed shadow data. Col 3, lines 17-20, “determining that a movement of the mobile device is less than a threshold amount, and causing the display to pan the third image rather than composite a new image based on the movement. The first image may include a first color palette, the second image may include a second color palette, and either the first image may be translated to the second color palette or the second image may be translated to the first color palette before the third image is generated. The second image may be compressed vertically when received by the mobile device such that a first half of the image comprises three-channel color information, and a second half comprises an alpha channel.” Col 6, lines 43-47, “The physical scanner 114 can [include] devices that include infrared (IR) emitters and depth cameras that analyze IR patterns to build 3-D map of physical objects in the physical environment. In some embodiments, the depth cameras may be integrated as part of the mobile device 104.” Col 9, lines 55-65, “the depth measurements from the physical scanner can be used to construct a 3-D model of the physical environment 300” “Some embodiments may transmit 3-D models and/or geometries to the content provider system, while other embodiments may transmit raw sensor data, such as a depth image representing the physical environment as a grayscale image recorded by the physical scanner.” Col 15, lines 19-20, “the content provider system may identify portions of the rendered image that includes those virtual objects having a virtual depth of 3 feet or more.”).
Caswell and Rasmussen are considered to be analogous art because both pertain to augmented reality. It would have been obvious before the effective filing date of the claimed invention to have modified Caswell with the features of “the rendered image sent by the server system comprises six channels of information including separate channels for red, green, blue, alpha, depth and shadow data” as taught by Rasmussen. The suggestion/motivation would have been that the mobile device may receive data (e.g., rendered images, alpha images, compositing rules, synchronization information, etc.) from the content provider system in order to composite an image rendered by the content provider system with an image captured from a camera of the mobile device (Rasmussen, Col 17, lines 16-39).
As to claim 4, claim 3 is incorporated and the combination of Caswell and Rasmussen discloses the six channels of information are each sent every frame by splitting each frame into top and bottom portions and sending the RGB data in one of the top and bottom portions and sending the alpha, shadow and depth data in the other of the top and bottom portions (Rasmussen, Col 3, lines 17-21, “The second image may be compressed vertically when received by the mobile device such that a first half of the image comprises three-channel color information, and a second half comprises an alpha channel.” It would have been obvious to include shadow and depth data in the second half since Rasmussen also utilizes lighting and depth data. Col 17, lines 16-39, “These novel compression techniques may be necessary because existing image compression techniques assumed that an image was already been composited. Therefore, the existing compression techniques do not provide support for four-channel (three color channels and one alpha channel) images to be compressed and transmitted efficiently. Because the rendering operation is being completed on one computer system (i.e., the content provider system) and the compositing is being performed on another computer system (i.e., the mobile device 104), un-composited images with an alpha channel may need to be transmitted from one computer system to the other. Compressing the alpha image with the three-channel RGB image as described above reduces the bandwidth needed to transmit these images in real time in half.” “The mobile device 104 may receive data (e.g., rendered images, alpha images, compositing rules, synchronization information, etc.) from the content provider system in order to composite an image rendered by the content provider system with an image captured from a camera of the mobile device.” Col 16, lines 40-66, “The content provider system may then generate a frame to be sent to the mobile device. The upper portion of the frame may include the compressed alpha image. The bottom portion of the frame may include the compressed rendered image.”).
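Examiner's note (illustration only): the claim 4 arrangement, in which a single transmitted frame carries the RGB data in one portion and the alpha, shadow and depth data in the other, may be pictured with the sketch below. It simply stacks the two halves and omits the vertical compression described by Rasmussen; the function names and array sizes are hypothetical.

    import numpy as np

    def pack_frame(rgb, alpha, shadow, depth):
        # Top portion carries the RGB data; bottom portion carries alpha, shadow and depth
        # as the three planes of a second, equally sized image.
        aux = np.stack([alpha, shadow, depth], axis=-1)
        return np.vstack([rgb, aux])

    def unpack_frame(frame):
        h = frame.shape[0] // 2
        rgb, aux = frame[:h], frame[h:]
        return rgb, aux[..., 0], aux[..., 1], aux[..., 2]

    if __name__ == "__main__":
        h, w = 4, 6
        rgb = np.zeros((h, w, 3), dtype=np.uint8)
        alpha = np.full((h, w), 255, dtype=np.uint8)
        shadow = np.zeros((h, w), dtype=np.uint8)
        depth = np.full((h, w), 128, dtype=np.uint8)
        frame = pack_frame(rgb, alpha, shadow, depth)     # one transmitted frame, six channels
        print(frame.shape)                                # (8, 6, 3): top half color, bottom half aux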
As to claim 6, claim 1 is incorporated and the combination of Caswell and Rasmussen discloses the telemetry data is sent by the mobile device to the server at a first frequency of between 30-120 Hz and the mobile device also sends camera settings data including data representing the field of view (FOV) and resolution of the camera and data indicating the focal length and f-stop of the lens that the first image was captured at (Rasmussen, Col 13, lines 26-51, “The update intervals may correspond to a frame rate of the mobile device 104 (e.g., 30 frames per second), such that the virtual environment is updated to correspond to each frame captured by the camera of the mobile device 104. As used herein, the term “real-time” may include real-time or near real-time computing such that results are displayed to the user at interactive frame rates. In some embodiments, “real-time” may include displaying a composite image on the screen of the mobile device within 5 ms, 10 ms, 20 ms, 30 ms, 50 ms, or 100 ms of a time at which the corresponding image was captured by the camera of the mobile device.” Col 8, lines 50-66, “In addition to the location and/or orientation of the mobile device 104, other information regarding the physical environment may also be provided to the content provider system. In some embodiments, lens information for the camera of the mobile device 104 may be provided to the content provider system. This information may be transmitted from the mobile device 104 to the content provider system as part of a calibration or startup routine. Alternatively, this information may be stored on the content provider system if the lens information from the mobile device 104 is known. Lens information may include a focal length, an f-stop, and/or physical characteristics of the lens of the camera itself, included lens distortion properties. Note that the focal length (and thus the f-stop) of the camera may change dynamically as the user moves through the physical environment 300 and zooms in/out or focuses on different aspects of the physical environment 300. Therefore, the focal length and other lens information may be updated in real-time as changes occur in the physical environment 300 such that they can be duplicated by the virtual camera in the virtual environment.”).
Caswell and Rasmussen are considered to be analogous art because both pertain to augmented reality. It would have been obvious before the effective filing date of the claimed invention to have modified Caswell with the features of “the telemetry data is sent by the mobile device to the server at a first frequency of between 30-120 Hz and the mobile device also sends camera settings data including data representing the field of view (FOV) and resolution of the camera and data indicating the focal length and f-stop of the lens that the first image was captured at” as taught by Rasmussen. The suggestion/motivation would have been that the mobile device may receive data (e.g., rendered images, alpha images, compositing rules, synchronization information, etc.) from the content provider system in order to composite an image rendered by the content provider system with an image captured from a camera of the mobile device (Rasmussen, Col 17, lines 16-39).
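Examiner's note (illustration only): a telemetry message of the kind recited in claim 6, carrying GPS location, orientation, and camera settings data (FOV, resolution, focal length, f-stop) at a 30-120 Hz rate, might be structured as in the sketch below. The field names and the JSON serialization are hypothetical choices, not drawn from Caswell, Rasmussen, or the claims.

    from dataclasses import dataclass, asdict
    import json
    import time

    @dataclass
    class CameraSettings:
        fov_deg: float          # field of view
        width_px: int           # resolution
        height_px: int
        focal_length_mm: float  # lens focal length at capture
        f_stop: float           # lens f-stop at capture

    @dataclass
    class TelemetryPacket:
        timestamp: float
        lat: float
        lon: float
        roll: float
        pitch: float
        yaw: float
        camera: CameraSettings

    if __name__ == "__main__":
        rate_hz = 60                                           # anywhere in the claimed 30-120 Hz range
        cam = CameraSettings(68.0, 1920, 1080, 26.0, 1.8)
        pkt = TelemetryPacket(time.time(), 37.0, -122.0, 0.0, 1.5, 90.0, cam)
        payload = json.dumps(asdict(pkt)).encode()             # stand-in for the wireless channel
        print(len(payload), "bytes sent every", round(1000 / rate_hz), "ms")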
As to claim 9, claim 7 is incorporated and the combination of Caswell and Rasmussen discloses the rendered image sent by the server system comprises six channels of information including separate channels for red, green, blue, alpha, depth and shadow data (See claim 3 for detailed analysis.).
As to claim 10, claim 9 is incorporated and the combination of Caswell and Rasmussen discloses the six channels of information are each sent every frame by splitting each frame into top and bottom portions and sending the RGB data in one of the top and bottom portions and sending the alpha, shadow and depth data in the other of the top and bottom portions (See claim 4 for detailed analysis.).
As to claim 12, claim 7 is incorporated and the combination of Caswell and Rasmussen discloses the telemetry data is sent by the mobile device to the server at a first frequency of between 30-120 Hz and the mobile device also sends camera settings data including data representing the field of view (FOV) and resolution of the camera and data indicating the focal length and f-stop of the lens that the first image was captured at (See claim 6 for detailed analysis.).
As to claim 13, claim 7 is incorporated and the combination of Caswell and Rasmussen discloses the mobile device also sends an environ sphere comprising an RGB image to the server system via the wireless communication channel (Rasmussen, Col 6, lines 16-36, “This allows for the light sensors to provide a position and intensity of light sources as would be seen by the mobile device 104. In some embodiments, the light sensor may include a 360° camera system, such as the OZO™ system from Nokia®. A 360° camera can provide real-time video stream that can be used to stream a panoramic video texture to the content provider system 102.” The OZO™ camera is a spherical (360°) camera, reading on the claimed environ sphere.).
As to claim 16, claim 14 is incorporated and the combination of Caswell and Rasmussen discloses the rendered image sent by the server system comprises six channels of information including separate channels for red, green, blue, alpha, depth and shadow data (See claim 3 for detailed analysis.).
As to claim 17, claim 16 is incorporated and the combination of Caswell and Rasmussen discloses the six channels of information are each sent every frame by splitting each frame into top and bottom portions and sending the RGB data in one of the top and bottom portions and sending the alpha, shadow and depth data in the other of the top and bottom portions (See claim 4 for detailed analysis.).
Claims 5, 11, 18 are rejected under 35 U.S.C. 103 as being unpatentable over Caswell et al. (US Pub 2021/0142580 A1) in view of Rasmussen et al. (US Patent 10,692,288) and Bruls et al. (US Patent 10,567,728 B2).
As to claim 5, claim 3 is incorporated and Caswell does not disclose the six channels of information are each sent every other frame by repeatedly sending a first frame with the RGB data and, immediately following the first frame, sending a second frame with the alpha, shadow and depth data.
Bruls teaches the six channels of information are each sent every other frame by repeatedly sending a first frame with the RGB data and, immediately following the first frame, sending a second frame with the alpha, shadow and depth data (Bruls, abstract, “Extra frames (D, D′) are encoded that provide the depth components and further data for use in rendering based on the image and the depth components. The extra frames are encoded using spatial and/or temporal subsampling of the depth components and the further data, while the extra frames are interleaved with the image frames in the signal in a Group of Pictures coding structure (GOP).” Col 20, lines 6-10, “By moreover making use of 2:1, 2:2:1 interleaving extra frame insertions which can contain various components (such as depth components or transparency components)) of multiple time instances; e.g. Dt−1 and Dt−2, may be realized.” Col 21, lines 5-20, “As indicated above the content of the D′ component is typically not limited to depth but may also comprise background texture (BG), transparency (T) and additional metadata information. Metadata can be additional image information to improve the 3D perceived quality, but also content related information (e.g. signaling etc.). Typical components are D ((foreground) depth), BG (background texture), BD (background depth) and T (transparency map).”).
Caswell, Rasmussen and Bruls are considered to be analogous art because all pertain to 3D images. It would have been obvious before the effective filing date of the claimed invention to have modified Caswell with the features of “the six channels of information are each sent every other frame by repeatedly sending a first frame with the RGB data and, immediately following the first frame, sending a second frame with the alpha, shadow and depth data” as taught by Bruls. The suggestion/motivation would have been that metadata can be additional image information to improve the 3D perceived quality, but also content related information (Bruls, Col 21, lines 5-20).
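Examiner's note (illustration only): the claim 5 scheme of alternating frames, an RGB frame followed immediately by a frame carrying the alpha, shadow and depth data, can be pictured with the temporal interleaving sketch below. The function names and placeholder frame contents are hypothetical and are not drawn from Bruls.

    def interleave(rgb_frames, aux_frames):
        # Alternate frames: the RGB data first, then the alpha/shadow/depth frame for the same instant.
        stream = []
        for rgb, aux in zip(rgb_frames, aux_frames):
            stream.append(("RGB", rgb))
            stream.append(("AUX", aux))
        return stream

    def deinterleave(stream):
        # Reunite each color frame with the auxiliary frame that immediately follows it.
        return [(stream[i][1], stream[i + 1][1]) for i in range(0, len(stream) - 1, 2)]

    if __name__ == "__main__":
        rgb_frames = ["rgb0", "rgb1", "rgb2"]
        aux_frames = ["aux0", "aux1", "aux2"]              # alpha, shadow and depth per time instant
        stream = interleave(rgb_frames, aux_frames)
        print([tag for tag, _ in stream])                  # ['RGB', 'AUX', 'RGB', 'AUX', 'RGB', 'AUX']
        print(deinterleave(stream))                        # [('rgb0', 'aux0'), ('rgb1', 'aux1'), ('rgb2', 'aux2')]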
As to claim 11, claim 9 is incorporated and the combination of Caswell, Rasmussen and Bruls discloses the six channels of information are each sent every other frame by repeatedly sending a first frame with the RGB data and, immediately following the first frame, sending a second frame with the alpha, shadow and depth data (See claim 5 for detailed analysis.).
As to claim 18, claim 16 is incorporated and the combination of Caswell, Rasmussen and Bruls discloses the six channels of information are each sent every other frame by repeatedly sending a first frame with the RGB data and, immediately following the first frame, sending a second frame with the alpha, shadow and depth data (See claim 5 for detailed analysis.).
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Koperwas et al. (US Pub 2020/0265638 A1) discloses generating shadows in the physical world that correspond to virtual objects displayed on MR displays.
Cordes et al. (US Pub 2020/0145644 A1) discloses detecting and correcting lighting artifacts caused by movements of one or more taking cameras in a performance area consisting of multiple displays.
Huston et al. (US Pub 2018/0108172 A1) discloses capturing a location-based experience at an event using a plurality of camera-equipped mobile devices near a point of interest to capture random, crowdsourced images and associated metadata near said point of interest.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to YU CHEN whose telephone number is (571)270-7951. The examiner can normally be reached on M-F 8-5 PST Mid-day flex.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Xiao Wu can be reached on 571-272-7761. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/YU CHEN/Primary Examiner, Art Unit 2613