Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 1/20/2026 has been entered.
Response to Amendment
The amendment filed on 1/20/2026 has been entered and made of record. Claim 1 is amended. Claims 2-3 are cancelled. Claims 1 and 4-10 are pending.
Response to Arguments
Applicant’s arguments with respect to claim 1 have been considered but they are not persuasive.
Applicant asserts that Dixit mentions server rendering only in general terms and does not address the specific problem of latency-induced positioning error. To achieve "latency concealment", the server of the mixed reality rendering system recited in claim 1 is configured to generate rendered objects to be displayed at a second time point based on a virtual streaming camera. This virtual streaming camera is generated by the server according to a virtual straight line and a predetermined distance, wherein the virtual straight line is defined by a virtual coordinate of the display device and a virtual coordinate of a native object. The virtual coordinate is converted from the physical coordinate of the display device, which is detected by the display device at the first time point. That is, the mixed reality rendering system utilizes data obtained at the first time point to configure the rendering for display at the second time point (Remarks, p. 3).
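Restated in notation for clarity (the following symbols are illustrative labels only and appear neither in claim 1 nor in the Remarks): letting V denote the first-time-point virtual coordinate of the display device, O the virtual coordinate of the native object, and d the predetermined distance, the argued placement puts the virtual streaming camera at a point C on the virtual straight line through V and O, e.g.,

C = O + d · (V − O) / ‖V − O‖,

under one possible reading in which d is measured from the native object; the claim does not specify the endpoint from which the predetermined distance is measured.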
Examiner notes that claim 1 recites “the display device detects a first-time-point physical coordinate of the display device at a first time point, and transmits the first-time-point physical coordinate of the display device to the server; the server converts the first-time-point physical coordinate of the display device into a first-time-point virtual coordinate”. Here, the first time point is the time before the server converts the physical coordinate of the display device into a virtual coordinate, while the second time point is the time at which the display device displays the object rendered by the server based on the virtual streaming camera. The claim language thus recites a well-known client-server processing structure. The argued latency-induced positioning error problem amounts to new matter, as it is neither recited in claim 1 nor supported by the specification.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1, 4, and 7-10 are rejected under 35 U.S.C. 103 as being unpatentable over Dixit et al. (US 2020/0357165) in view of Wang et al. (US 2023/0360317 A1).
As to Claim 1, Dixit teaches A mixed reality rendering system, comprising:
a display device at a user end; and a server at a remote end, wherein the server has a native object built therein, and the native object has a virtual coordinate (Dixit discloses client device 102 and server 106 in Fig 1; see also [0008]. Here, the native object refers to a native principal point. For example, Wang discloses “a native principal point defined relative to an origin of a coordinate system of the digital image” in [0002]. Please note that applicant fails to describe what a native object is in the specification.);
wherein the mixed reality rendering system is adapted to perform the following steps:
the display device and the server establish a communication therebetween (Dixit, Fig 1);
the server transmits the virtual coordinate of the native object to the display device; the display device converts the virtual coordinate of the native object into a physical object coordinate (Dixit discloses “the AR view should be highly accurate and the computer generated features of the AR object need to be registered accurately with the physical features of the real world environment. This registration needs to be maintained through viewing perspective changes. In order for the rendered AR object and the real world environment of a user to align properly, the pose and optical properties of the real and virtual cameras must be the same. The position and orientation of the real and virtual objects in some world coordinate system must also be known” in [0039]; “The server sends rendered bitmaps to the client device. The client device then draws received bitmaps by using, for example and without limitation, the anchor points identified by client device. An anchor point is a location of a marker in the real world.” in [0041]);
the display device detects a first-time-point physical coordinate of the display device at a first time point, and transmits the first-time-point physical coordinate of the display device to the server (Dixit discloses “server system 106 may provide AR content to client device based on information received from client device 102. For example, server 106 may determine that a request for AR object has been received based on a detected position, detected position change, or detected/determined context of user and/or client device 102” in [0054]; see also GPS sensors in [0050]);
the server converts the first-time-point physical coordinate of the display device into a first-time-point virtual coordinate (Dixit discloses “To be most useful, the AR view should be highly accurate and the computer generated features of the AR object need to be registered accurately with the physical features of the real world environment. This registration needs to be maintained through viewing perspective changes. In order for the rendered AR object and the real world environment of a user to align properly, the pose and optical properties of the real and virtual cameras must be the same” in [0039], see also Fig 7-8);
the server generates a virtual straight line passing through the first-time-point virtual coordinate of the display device and the virtual coordinate of the native object based on an equation (Dixit discloses “In order for the rendered AR object and the real world environment of a user to align properly, the pose and optical properties of the real and virtual cameras must be the same. The position and orientation of the real and virtual objects in some world coordinate system must also be known… In general, "initialization" refers to techniques used to determine the initial position and orientation (i.e., the initial pose) of a real camera that captures a view of the physical environment to be augmented and using it to initialize a virtual camera with the same pose” in [0039]; “The rendering engine loads the identified AR object and initiates the determined number of virtual cameras at 212. The rendering engine also initiates a distance "D" between the two virtual cameras if the client device is a device that supports stereoscopic view as a function of the distance of the anchor point from the client device 102 (z-coordinate of the anchor point). For example, the distance D may be a function of triangulated distance of anchor point…” in [0059]; see also Fig 3A-3B);
the server generates a first-time-point virtual streaming camera based on the virtual straight line and a predetermined distance, wherein the first-time-point virtual streaming camera has a virtual camera coordinate, and the first-time-point virtual streaming camera is located on the virtual straight line (Dixit discloses “In general, "initialization" refers to techniques used to determine the initial position and orientation (i.e., the initial pose) of a real camera that captures a view of the physical environment to be augmented and using it to initialize a virtual camera with the same pose” in [0039]; “The rendering engine renders the identified AR object as a separate bitmap using each initialized virtual camera, as shown by 214. Each rendered bitmap is then sent to the client device 102 in 216” in [0062]; “The present solution may also be used for sharing an AR environment having both static and dynamic AR objects with another user. Dynamic objects may best be streamed directly from the server…” in [0109]); and
the server renders a rendered object based on the first-time-point virtual streaming camera (Dixit discloses “the rendering of an AR object may be performed on the server side by initializing a virtual camera on the server” in [0040]; “In certain embodiments, the server 106 may completely render the bitmap(s) for the AR object” in [0065].);
the server transmits the rendered object to the display device; and the display device displays the rendered object at the physical object coordinate at a second time point, wherein the second time point is later than the first time point, and the physical object coordinate at the second time point is different from the physical object coordinate at the first time point (Dixit discloses “In a typical video see-through system, the user sees a live video of a real-world scenario, including one or more particular objects augmented or enhanced on the live video” in [0033]; “According to various aspects of the current disclosure, the computationally expensive rendering of AR objects happens on a server and rendered bitmaps are sent to a client device (or other end node) for display” in [0038]; “server system 106 may provide AR content to client device based on information received from client device 102. For example, server 106 may determine that a request for AR object has been received based on a detected position, detected position change, or detected/determined context of user and/or client device 102” in [0054]. Here, the server generates the rendered data at the first time point, and the client device receives the rendered image transmitted from the server at the second time point. See also the “Response to Arguments” section above.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the invention of Dixit with the invention of Wang so as to define the native principal point relative to an origin of a coordinate system of the digital image (Wang, [0002]).
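As an illustrative aid only, the camera-placement limitations mapped above can be sketched as follows. The function and variable names are examiner-chosen placeholders that appear in neither the claim nor Dixit or Wang, and measuring the predetermined distance from the native object is one assumed reading of the claim:

```python
import numpy as np

def place_streaming_camera(display_virtual, native_object, predetermined_distance):
    """Place a virtual streaming camera on the virtual straight line through
    the display device's first-time-point virtual coordinate and the native
    object's virtual coordinate, at the predetermined distance from the
    native object (an assumed reference endpoint)."""
    direction = display_virtual - native_object
    direction = direction / np.linalg.norm(direction)  # unit vector along the line
    return native_object + predetermined_distance * direction

# Example with placeholder coordinates: display at (0, 0, 5), object at origin
camera = place_streaming_camera(np.array([0.0, 0.0, 5.0]),
                                np.array([0.0, 0.0, 0.0]), 2.0)
```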
As to Claim 4, Dixit in view of Wang teaches The mixed reality rendering system of claim 1, wherein the mixed reality rendering system is adapted to further perform the following steps:
the display device detects a head orientation of the display device at the first time point, and transmits the head orientation to the server; the server creates a visible cone based on the virtual coordinate of the display device and the head orientation; the server determines whether the native object is located within the visible cone; when the native object is located within the visible cone, the server transmits the rendered object to the display device; and the display device displays the rendered object at a second time point, wherein the second time point is later than the first time point (Dixit discloses “server system 106 may provide AR content to client device based on information received from client device 102. For example, server 106 may determine that a request for AR object has been received based on a detected position, detected position change, or detected/determined context of user and/or client device 102. Such AR content may also be provided to a client device 102 based upon explicit requests received from the client device 102 or based on a detected and/or recognized object within a field of view of an imaging device associated with the client device 102” in [0054]; “The result of the filtering operations is a sub-catalog. The sub-catalog contains only the markers and AR objects that are made available when the circumstances specified by the contextual information exist. This sub-catalog (rather than the full catalog) is searched when requests for AR content from the client device 102 need to be fulfilled” in [0055].)
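For illustration only, a visible-cone containment test of the kind mapped above can be sketched as follows. The half-angle parameter and all names are examiner-chosen assumptions; Dixit describes field-of-view-based filtering but not this specific computation:

```python
import numpy as np

def in_visible_cone(device_virtual, head_orientation, target, half_angle_deg=45.0):
    """Return True if target lies inside a cone with its apex at the display
    device's virtual coordinate and its axis along the head orientation.
    half_angle_deg is an assumed field-of-view parameter."""
    to_target = target - device_virtual
    dist = np.linalg.norm(to_target)
    if dist == 0.0:
        return True  # target coincides with the apex
    axis = head_orientation / np.linalg.norm(head_orientation)
    return np.dot(to_target / dist, axis) >= np.cos(np.radians(half_angle_deg))
```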
As to Claim 7, Dixit in view of Wang teaches The mixed reality rendering system of claim 1, wherein a distance between the virtual camera coordinate and the virtual coordinate of the native object equals the predetermined distance (Dixit discloses “HMDs typically utilize a combination of optics and stereopsis to focus virtual imagery at a fixed distance from the HMD (i.e., stereoscopic image) using parallax” in [0034]; “By setting an amount of parallax for an AR object, its virtual distance is implied by the distance at which line of sight to the AR object generated by the left eye display intersects the corresponding line of sight for the right eye display” in [0035]; see also [0059] and Fig 3.)
As to Claim 8, Dixit in view of Wang teaches The mixed reality rendering system of claim 1, wherein the server further has another native object built therein, wherein the another native object has another virtual coordinate; the mixed reality rendering system is adapted to further perform the following steps:
the server generates another virtual straight line passing through the virtual coordinate of the display device and the another virtual coordinate of the another native object based on another equation; the server generates another virtual streaming camera based on the another virtual straight line and another predetermined distance, wherein the another virtual streaming camera has another virtual camera coordinate, and the another virtual streaming camera is located on the another virtual straight line; and the server renders another rendered object based on the another virtual streaming camera (Dixit discloses the left virtual camera 302(L) and the right virtual camera 302 (R) in [0067], see also Fig 3 and [0039, 0062].)
As to Claim 9, Dixit in view of Wang teaches The mixed reality rendering system of claim 8, wherein the mixed reality rendering system is adapted to further perform the following steps:
the server transmits the rendered object and the another rendered object to the display device; and the display device displays the rendered object and the another rendered object at a second time point, wherein the second time point is later than the first time point (Dixit discloses “In other examples, two rendered bitmaps are drawn on each display (left eye display and right eye display) of client device 102 if the client device 102 is configured to support stereoscopic view which are subsequently perceived by the user as a stereoscopic image of the AR object at a desired depth” in [0064]; see also [0077].)
As to Claim 10, Dixit in view of Wang teaches The mixed reality rendering system of claim 9, wherein the mixed reality rendering system is adapted to further perform the following steps:
the server transmits the another virtual coordinate of the another native object to the display device;
the display device converts the another virtual coordinate of the another native object into another physical object coordinate; and the display device displays the another rendered object at the another physical object coordinate at the second time point (Dixit discloses “To be most useful, the AR view should be highly accurate and the computer generated features of the AR object need to be registered accurately with the physical features of the real world environment. This registration needs to be maintained through viewing perspective changes. In order for the rendered AR object and the real world environment of a user to align properly, the pose and optical properties of the real and virtual cameras must be the same” in [0039]; “the server 106 may initiate two virtual cameras for rendering two images of the AR object –one for the right eye display of the client device 102 and another for the left eye display of the client device 102” in [0077].)
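As a non-limiting illustration of the virtual-to-physical conversion recited here, one common form is a similarity transform; the rotation R, translation t, and scale s below are examiner-chosen placeholders, as neither the claim nor Dixit specifies the form of the conversion:

```python
import numpy as np

def virtual_to_physical(virtual_coord, R, t, s=1.0):
    """Map a virtual coordinate to a physical coordinate via an assumed
    similarity transform p = s * (R @ v) + t."""
    return s * (R @ virtual_coord) + t

# Example with an identity rotation and a pure translation (placeholder values)
p = virtual_to_physical(np.array([1.0, 2.0, 3.0]), np.eye(3), np.array([0.5, 0.0, 0.0]))
```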
Claim 5 is rejected under 35 U.S.C. 103 as being unpatentable over Dixit in view of Wang and Traynor et al. (US 2024/0419240 A1).
As to Claim 5, Dixit in view of Wang teaches The mixed reality rendering system of claim 4, wherein the mixed reality rendering system is adapted to further perform the following step:
when the native object is not located within the visible cone, the server does not transmit the rendered object to the display device (Dixit discloses “As discussed above, the server 106 may determine that a request for AR object has been received based on a detected position, detected position change, or detected/determined context of user and/or client device 102. Such AR content may also be provided to a client device 102 based upon explicit requests received from the client device or based on a detected and/or recognized object within a field of view of an imaging device associated with the client device 102” in [0074]. It is obvious that if the detected and/or recognized object is not located within the field of view, the server does not transmit the rendered object to the client device. For example, Traynor discloses “For example, if a passenger's direction of attention is pointed in a first direction, content related to the ride profile (e.g., dynamic, static, or other ride profile) and located in a second direction that would not be viewable within the passenger's field of view may be ignored during rendering (i.e., not rendered) to conserve processing power/bandwidth and/or energy” in [0024].)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the invention of Dixit and Wang with the invention of Traynor so as to regulate digital content rendering and avoid expending processing power/bandwidth on invisible content (Traynor, [0024]).
Claim 6 is rejected under 35 U.S.C. 103 as being unpatentable over Dixit in view of Wang, Traynor et al. (US 2024/0419240 A1) and Garon et al. (WO 2019/098989 A1).
As to Claim 6, Dixit in view of Wang teaches The mixed reality rendering system of claim 1, wherein the mixed reality rendering system is adapted to further perform the following steps:
the server defines frame boundaries, wherein the frame boundaries contain the native object inside; the server determines whether vertices of the frame boundaries are all outside of the visible cone; and when the vertices of the frame boundaries are all outside of the visible cone, the server does not transmit the rendered object to the display device (Traynor discloses “For example, if a passenger's direction of attention is pointed in a first direction, content related to the ride profile (e.g., dynamic, static, or other ride profile) and located in a second direction that would not be viewable within the passenger's field of view may be ignored during rendering (i.e., not rendered) to conserve processing power/bandwidth and/or energy” in [0024]. Garon further discloses “The system 100 ensures that a user is actually given the opportunity to opt out by determining whether the digital component that includes the opt-out element is within a viewport of the user's device” in [0071]; “When determining the coordinates of the visible boundary of the display, the executable script can identify the x-y coordinates of the vertices of the display. In some implementations, the executable script can obtain the coordinates of the boundaries of the display from the client device itself. For example, parameters of the client device can be accessed by the executable script” in [0088]; “The one or more servers can compare the coordinates of the digital component being presented with visible coordinate boundaries of the display (304). The one or more servers can compare the coordinates of the digital component with the boundary coordinates of the viewport” in [0091]; see also [0092-0093].)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the invention of Dixit and Wang with the invention of Traynor so as to regulate digital content rendering and avoid expending processing power/bandwidth on invisible content (Traynor, [0024]). The motivation for further combining Garon is to determine whether the rendered object is within the viewport (Garon, [0093]).
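For illustration only, the frame-boundary test mapped for claim 6 can be sketched as follows; all names and the cone parameterization are examiner-chosen assumptions of the same form as the claim 4 sketch above:

```python
import numpy as np

def cone_contains(apex, axis, point, half_angle_deg=45.0):
    """Assumed visible-cone containment test (same form as the claim 4 sketch)."""
    v = point - apex
    d = np.linalg.norm(v)
    if d == 0.0:
        return True
    u = axis / np.linalg.norm(axis)
    return np.dot(v / d, u) >= np.cos(np.radians(half_angle_deg))

def should_transmit(frame_vertices, device_virtual, head_orientation):
    """Per the claim 6 mapping: withhold transmission only when all vertices
    of the frame boundaries fall outside the visible cone."""
    return any(cone_contains(device_virtual, head_orientation, v)
               for v in frame_vertices)
```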
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to WEIMING HE whose telephone number is (571)270-1221. The examiner can normally be reached on Monday-Friday, 8:30am-5:00pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Tammy Goddard can be reached on 571-272-7773. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/WEIMING HE/
Primary Examiner, Art Unit 2611