Prosecution Insights
Last updated: April 19, 2026
Application No. 18/172,054

AUGMENTATION OF VOLUMETRIC VIDEO WITH NON-VISUAL INFORMATION IN A VIRTUAL REALITY ENVIRONMENT

Status: Non-Final OA (§103)
Filed: Feb 21, 2023
Examiner: HUYNH, THANG GIA
Art Unit: 2611
Tech Center: 2600 — Communications
Assignee: International Business Machines Corporation
OA Round: 1 (Non-Final)
Grant Probability: 76% (Favorable)
OA Rounds: 1-2
To Grant: 2y 4m
With Interview: 99%

Examiner Intelligence

Career Allow Rate: 76% (19 granted / 25 resolved; +14.0% vs TC avg), above average
Interview Lift: +50.0% for resolved cases with interview (strong)
Typical Timeline: 2y 4m average prosecution; 21 applications currently pending
Career History: 46 total applications across all art units
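
For concreteness, here is a minimal sketch of the arithmetic behind the career figures above. This is an assumption about how the panel is computed, not the tool's actual code; only the 19/25 counts come from the panel, and the 62% TC average is inferred from the +14.0% delta.

```python
granted, resolved = 19, 25
allow_rate = granted / resolved          # career allow rate
tc_avg_allow = 0.62                      # inferred from the +14.0% delta shown
delta = allow_rate - tc_avg_allow
print(f"{allow_rate:.0%} career allow rate ({delta:+.1%} vs TC avg)")
# -> 76% career allow rate (+14.0% vs TC avg)
```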

Statute-Specific Performance

§101: 2.3% (-37.7% vs TC avg)
§103: 73.9% (+33.9% vs TC avg)
§102: 7.7% (-32.3% vs TC avg)
§112: 11.5% (-28.5% vs TC avg)
Deltas are measured against the Tech Center average estimate; based on career data from 25 resolved cases.
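
Read as a computation, the comparison above is simply each per-statute rate minus the Tech Center estimate. A small sketch follows; the dict layout is assumed, and the flat 40% estimate is inferred from the listed deltas rather than documented by the tool.

```python
# Examiner's per-statute rates from the panel above; the TC average
# estimate implied by every listed delta works out to a flat 40%.
examiner_rates = {"§101": 0.023, "§102": 0.077, "§103": 0.739, "§112": 0.115}
TC_AVG_ESTIMATE = 0.40  # inferred: e.g. 73.9% - 40.0% = +33.9%

for statute, rate in examiner_rates.items():
    print(f"{statute}: {rate:.1%} ({rate - TC_AVG_ESTIMATE:+.1%} vs TC avg)")
# §103: 73.9% (+33.9% vs TC avg), matching the panel
```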

Office Action

§103
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 8, and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Shahdi et al. (US 10477157 B1) (hereinafter referred to as Shahdi) in view of Wanbo et al. (US 20240112414 A1) (hereinafter referred to as Wanbo).

Regarding Claim 1, Shahdi discloses A method comprising: (See Abstract, "Aspects of the disclosed apparatuses, methods and systems . . .")

receiving, via a computational device of a virtual reality system, a video that captures visual information of a physical environment; (See Col 3 Line 64 – Col 4 Line 5, "In one general aspect, an augmented reality system (ARS) or a virtual reality system (VRS) includes one or more sensors, at least one computational device, and other electronic, optical, and mechanical components that provide the display of digital information (e.g., virtual elements) to augment the physical environment of a user (e.g., real world elements as perceived directly by the user in an ARS and/or by a video feed of the user's environment captured in real time in a VRS).")

receiving, via the computational device of the virtual reality system, non-visual information of the physical environment; and (See Col 1 Lines 38 – 47, "Aspects of the disclosed apparatuses, methods, and systems describe various methods, system, components, and techniques that provide a sensor array. In one general aspect, a sensor array includes multiple sensors and is configured and arranged to enable one or more augmented reality applications in an augmented reality system. For example, the sensor array is arranged and configured to acquire different types of data that are combined and processed using computer vision processes in order to augment a user's physical environment with virtual information." Also see Col 9 Line 66 – Col 10 Line 2, "In one example, the sensor array 101 may include a thermal sensor 129, such as a thermometer or a thermal camera (e.g., where each pixel of the image sensor senses temperature)." In this case, the temperature can be considered as "non-visual information" of the physical environment.)

integrating the video with the non-visual information to generate a video augmented with non-visual information in which navigation is performed in the virtual reality system.
(See Col 1 Lines 43 – 47, "For example, the sensor array is arranged and configured to acquire different types of data that are combined and processed using computer vision processes in order to augment a user's physical environment with virtual information." See Col 10 Lines 47 – 51, "The sensor array also includes a computational system designed to transfer the digital information that each sensor generates to other computational systems for further processing, for example, by computer vision algorithms and applications." See Col 4 Lines 12 – 17, "In one example, an ARS may be implemented as . . . a video-see through system (e.g., overlaying digital information on top of a video feed)." In this case, Shahdi teaches to augment and overlay (integrate) the digital information from the sensors, which would thus include temperature information (non-visual information), with the video. Finally, see Col 4 Lines 39 – 53, "To augment the user's environment with digital information, the ARS processes information about the user's real world environment. . . In addition, information about the position and movements of a user in relation to the real world elements is used." Here Shahdi teaches that users are able to move in the AR/VR environment, which would thus imply being able to perform "navigation".)

However, Shahdi fails to explicitly disclose receiving, via a computational device of a virtual reality system, a volumetric video that captures visual information of a physical environment; . . . integrating the volumetric video with the non-visual information to generate a volumetric video augmented with non-visual information in which navigation is performed in the virtual reality system.

Wanbo teaches receiving, via a computational device of a virtual reality system, a volumetric video that captures visual information of a physical environment; (See [0175], "For example, volumetric video of a mapped real-world room can be captured (e.g., including objects and local users in the real-world room) and rendered to remote users during the shared XR environment . . . The volumetric video can be generated using video from several synchronized cameras (e.g., XR devices positioned in the real-world room) that capture the room from different perspectives. In some implementations, the field of view for the several cameras comprise overlapping portions." Note that XR is a collective term that also includes virtual reality (See [0004]).)

integrating the volumetric video with the non-visual information to generate a volumetric video augmented with non-visual information in which navigation is performed in the virtual reality system. (See [0175] teaching to specifically use a volumetric video for an XR environment. Also see [0004], "Various XR environments exist, allowing representations of users to move about and speak with one another." Once again, since users can move in the XR environment, "navigation" can be performed. In combination with Shahdi already teaching to integrate the video with non-visual information, the above limitation is taught.)

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Shahdi with Wanbo to include using a volumetric video instead of a normal video for VR. The motivation to combine Shahdi with Wanbo would have been obvious as both arts are within the same field of virtual reality (See Wanbo Abstract).
The benefit of using volumetric video for VR is that it enables a more realistic experience, as it allows users to view objects and the environment from any angle in real time. This is shown by Wanbo [0175] teaching how the volumetric video is generated from several synchronized cameras capturing the room from different perspectives.

Regarding Claim 8, Shahdi in view of Wanbo discloses A system, comprising: (See Shahdi Abstract, "Aspects of the disclosed apparatuses, methods and systems . . .") a memory; and a processor coupled to the memory, wherein the processor performs operations, the operations comprising: (See Shahdi Col 23 Lines 40 – 42, "Generally, a processor will receive instructions and data from a read-only memory or a random access memory or both . . .") receiving, in a virtual reality system, a volumetric video that captures visual information of a physical environment; receiving, in the virtual reality system, non-visual information of the physical environment; and integrating the volumetric video with the non-visual information to generate a volumetric video augmented with non-visual information in which navigation is performed in the virtual reality system. (The above limitations are similar to those of Claim 1 and are therefore rejected under a similar rationale as that of Claim 1.)

Regarding Claim 15, Shahdi in view of Wanbo discloses A computer program product, the computer program product comprising a computer readable storage medium having computer readable program code embodied therewith, the computer readable program code when executed is configured to perform operations, the operations comprising: (See Shahdi Col 23 Lines 10 – 19, "The techniques can be implemented as a computer program product, i.e., a computer program tangibly embodied in a non-transitory information carrier, for example, in a machine-readable storage device, in machine-readable storage medium, in a computer-readable storage device or, in computer-readable storage medium for execution by, or to control the operation of, data processing apparatus or processing device, for example, a programmable processor, a computer, or multiple computers.") receiving, via a computational device of a virtual reality system, a volumetric video that captures visual information of a physical environment; receiving, via the computational device of the virtual reality system, non-visual information of the physical environment; and integrating the volumetric video with the non-visual information to generate a volumetric video augmented with non-visual information in which navigation is performed in the virtual reality system. (The above limitations are similar to those of Claim 1 and are therefore rejected under a similar rationale as that of Claim 1.)

Claims 2-3, 6, 9-10, 13, 16-17, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Shahdi in view of Wanbo and in further view of Duff et al. (US 20200034501 A1) (hereinafter referred to as Duff).

Regarding Claim 2, Shahdi in view of Wanbo discloses The method of claim 1, wherein the non-visual information includes a thermal imagery of the physical environment, and wherein the thermal imagery is captured by a thermal camera.
(See Shahdi Col 9 Line 66 – Col 10 Line 2, "In one example, the sensor array 101 may include a thermal sensor 129, such as a thermometer or a thermal camera (e.g., where each pixel of the image sensor senses temperature)." Note that "each pixel of the image sensor senses temperature" would imply "a thermal imagery".)

However, Shahdi in view of Wanbo fails to explicitly disclose wherein the thermal imagery is captured by an infrared camera. Duff is an art within the field of AR headgear (See Abstract, "Methods and apparatus for presenting data to a user with an augmented reality headgear that has been oriented in a direction based upon unique automated generation of a vector."). Duff teaches wherein the thermal imagery is captured by an infrared camera. (See [0011], "In some examples, a data capture device such as an infrared stereoscopic camera system, may be attached to a wearable AV Headgear." See [0091], "The Augmented Virtual Model exists in parallel to a physical structure in that the AVM includes virtual representations of physical structures and additionally receives and aggregates data relevant to the structures over time." See [0105], "By way of non-limiting example, on site data capture may include designation of an XYZ reference position and one or more of: image capture; infra-red capture; Temperature;")

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Shahdi in view of Wanbo with Duff to include non-visual information such as a thermal imagery captured by an infrared camera. The motivation to combine Shahdi in view of Wanbo with Duff would have been obvious as all three are within the same field of virtual reality (See Duff Abstract). Note that similar to Shahdi, Duff also teaches presenting data in VR (See Duff Abstract) and using sensors to gather data (See Duff [0091]). The benefit of using an infrared camera is that it offers better visual identification and sharper, more detailed images.

Regarding Claim 3, Shahdi in view of Wanbo and Duff discloses The method of claim 2, wherein the non-visual information includes an airflow information within the physical environment, and wherein the airflow information is additionally captured by an airflow measurement device. (See Duff [0116], "By way of non-limiting example, on site data capture may include designation of an XYZ reference position and one or more of: image capture; infra-red capture; Temperature; Humidity; Airflow; Pressure/tension;". Note that an "airflow measurement device" would be implied to exist to capture on-site airflow measurements. The motivation to combine would have been similar to that of the Claim 2 rejection motivation.)

Regarding Claim 6, Shahdi in view of Wanbo and Duff discloses The method of claim 1, wherein a user is able to perceive the physical environment including at least a temperature and an airflow information in addition to the visual information within the physical environment in the virtual reality system, and wherein the user is able to navigate within the volumetric video augmented with non-visual information in the virtual reality system. (See Shahdi Col 1 Lines 43 – 47 teaching acquiring different types of sensor data used to augment the user's physical environment with virtual information and Col 4 Lines 12 – 17 teaching to overlay the information on top of a video feed. Also see Shahdi Col 4 Lines 39 – 53 teaching that users are able to move in the AR/VR environment and thus are able to "navigate".
See Wanbo [0175] teaching using specifically a volumetric video for an XR environment. Additionally see Wanbo [0004] teaching that users can move in the XR environment. See Duff [0116], "By way of non-limiting example, on site data capture may include designation of an XYZ reference position and one or more of: image capture; infra-red capture; Temperature; Humidity; Airflow; Pressure/tension;" The motivation to combine would have been similar to that of the Claim 2 rejection motivation.)

Regarding Claim 9, Claim 9 contains similar limitations as to Claim 2 and is therefore rejected under a similar rationale as that of Claim 2. Regarding Claim 10, Claim 10 contains similar limitations as to Claim 3 and is therefore rejected under a similar rationale as that of Claim 3. Regarding Claim 13, Claim 13 contains similar limitations as to Claim 6 and is therefore rejected under a similar rationale as that of Claim 6. Regarding Claim 16, Claim 16 contains similar limitations as to Claim 2 and is therefore rejected under a similar rationale as that of Claim 2. Regarding Claim 17, Claim 17 contains similar limitations as to Claim 3 and is therefore rejected under a similar rationale as that of Claim 3. Regarding Claim 20, Claim 20 contains similar limitations as to Claim 6 and is therefore rejected under a similar rationale as that of Claim 6.

Claims 4-5, 11-12, and 18-19 are rejected under 35 U.S.C. 103 as being unpatentable over Shahdi in view of Wanbo and Duff and in further view of Asghar et al. (US 11562550 B1) (hereinafter referred to as Asghar).

Regarding Claim 4, Shahdi in view of Wanbo and Duff fails to explicitly disclose The method of claim 3, wherein the non-visual information includes a radar imagery, and wherein the radar imagery is captured by a radar that determines location and orientation of objects in the physical environment. Asghar teaches wherein the non-visual information includes a radar imagery, and wherein the radar imagery is captured by a radar that determines location and orientation of objects in the physical environment. (See Col 1 Lines 29 – 39, "Extended reality (XR) devices are another example of devices that can include one or more cameras. XR devices can include augmented reality (AR) devices, virtual reality (VR) devices, mixed reality (MR) devices, or the like. . . In general, an AR device can implement cameras and a variety of sensors to track the position of the AR device and other objects within the physical environment. An AR device can use the tracking information to provide a user of the AR device a realistic AR experience." See Col 17 Lines 29 – 32, "In some cases, the image sensor 104A and/or the image sensor 104N can include an RF sensor, such as a radar sensor, a LIDAR sensor and/or an IR sensor configured to perform RF and/or IR imaging of the environment.")

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Shahdi in view of Wanbo and Duff with Asghar to include using radar imagery captured by a radar that determines the location and orientation of objects in the physical environment. The motivation to combine Shahdi in view of Wanbo and Duff with Asghar would have been obvious as Asghar is also an art that is directed towards virtual reality devices with sensors (See Col 1 Lines 29 – 39).
As stated by Asghar in Col 1 Lines 37 – 39, the AR device can use the tracking information from the sensors to provide the user a more realistic AR experience, and thus this would be a potential benefit and reason to combine Asghar and its radar sensor used to track the surrounding objects.

Regarding Claim 5, Shahdi in view of Wanbo, Duff, and Asghar discloses The method of claim 4, wherein the non-visual information includes a sound level at different locations in the physical environment, and wherein the sound level is captured additionally by a sound level measurement device that determines the sound level at the different locations in the physical environment. (See Duff [0116], "By way of non-limiting example, on site data capture may include designation of an XYZ reference position and one or more of: image capture; infra-red capture; Temperature; Humidity; Airflow; Pressure/tension; Electromagnetic reading; Radiation reading; Sound readings (i.e. level of noise, sound pattern to ascertain equipment running and/or state of disrepair)". Also see Duff [0220], "Noise levels are another type of vibrational measurement which is focused on transmission through the atmosphere of the Structure. . . Thus, measurement of ambient sound with directional microphones or other microphonic sensing types may be used to elucidate the nature and location of noise emanations." The motivation to combine would have been similar to that of the Claim 4 rejection motivation.)

Regarding Claim 11, Claim 11 contains similar limitations as to Claim 4 and is therefore rejected under a similar rationale as that of Claim 4. Regarding Claim 12, Claim 12 contains similar limitations as to Claim 5 and is therefore rejected under a similar rationale as that of Claim 5. Regarding Claim 18, Claim 18 contains similar limitations as to Claim 4 and is therefore rejected under a similar rationale as that of Claim 4. Regarding Claim 19, Claim 19 contains similar limitations as to Claim 5 and is therefore rejected under a similar rationale as that of Claim 5.

Allowable Subject Matter

Claims 7 and 14 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims. The following is a statement of reasons for the indication of allowable subject matter:

Regarding Claim 7, the cited prior art does not disclose or render obvious the combination of elements cited in the claims as a whole. Specifically, the cited prior art fails to disclose or render obvious the limitations: wherein a learning mechanism improves a selection of the non-visual information to integrate with the volumetric video to generate the volumetric video augmented with non-visual information. Thus Claim 7 contains allowable subject matter. Regarding Claim 14, Claim 14 contains similar limitations as to Claim 7 and therefore contains similar allowable subject matter as that of Claim 7.

Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to THANG G HUYNH whose telephone number is (571)272-5432. The examiner can normally be reached Mon-Thu 7:30am-4:30pm EST | Fri 7:30am-11:30am EST. Examiner interviews are available via telephone, in person, and video conferencing using a USPTO supplied web-based collaboration tool.
To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Kee Tung, can be reached at (571)272-7794. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/T.G.H./ Examiner, Art Unit 2611
/KEE M TUNG/ Supervisory Patent Examiner, Art Unit 2611
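
As a purely illustrative aside, the integration the rejected independent claims recite (overlaying co-registered, non-visual sensor data such as per-pixel temperature onto captured video) can be pictured with a toy sketch. This is not the application's disclosed implementation or any cited reference's code; the array shapes, the red-blue color ramp, and the alpha blend are all assumptions made for illustration.

```python
import numpy as np

def augment_frame(rgb: np.ndarray, temps: np.ndarray, alpha: float = 0.4) -> np.ndarray:
    """Blend a per-pixel temperature map (H x W) into an RGB frame (H x W x 3)."""
    rng = temps.max() - temps.min()
    t = (temps - temps.min()) / (rng if rng > 0 else 1.0)  # normalize to [0, 1]
    # Map normalized temperature to a simple ramp: hot -> red, cold -> blue.
    overlay = np.stack([t, np.zeros_like(t), 1.0 - t], axis=-1)
    return (1.0 - alpha) * rgb + alpha * overlay           # alpha-blend the overlay

frame = np.random.rand(480, 640, 3)                  # stand-in captured frame
thermal = np.random.uniform(18.0, 30.0, (480, 640))  # stand-in co-registered sensor grid
augmented = augment_frame(frame, thermal)            # video "augmented with non-visual information"
```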

Prosecution Timeline

Feb 21, 2023: Application Filed
Dec 18, 2023: Response after Non-Final Action
Mar 23, 2026: Non-Final Rejection (§103, current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12597100: DEEP IMAGE DELIGHTING (granted Apr 07, 2026; 2y 5m to grant)
Patent 12586309: MACHINE-LEARNING METHOD ON VECTORIZED THREE-DIMENSIONAL MODEL AND LEARNING SYSTEM THEREOF (granted Mar 24, 2026; 2y 5m to grant)
Patent 12581083: METHOD, DEVICE, AND COMPUTER PROGRAM PRODUCT FOR COMPRESSING TWO-DIMENSIONAL IMAGE (granted Mar 17, 2026; 2y 5m to grant)
Patent 12560450: METHOD AND SERVER FOR GENERATING SPATIAL MAP (granted Feb 24, 2026; 2y 5m to grant)
Patent 12554815: DEVICES, METHODS, AND GRAPHICAL USER INTERFACES FOR AUTHORIZING A SECURE OPERATION (granted Feb 17, 2026; 2y 5m to grant)
Study what changed to get past this examiner. Based on 5 most recent grants.

AI Strategy Recommendation

Get an AI-powered prosecution strategy using examiner precedents, rejection analysis, and claim mapping.

Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: 76%
With Interview: 99% (+50.0%)
Median Time to Grant: 2y 4m
PTA Risk: Low
Based on 25 resolved cases by this examiner. Grant probability derived from career allow rate.
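
One plausible reading of how these figures fit together (an assumption; the tool does not publish its formula here) is that 99% is the allow rate restricted to this examiner's resolved cases that included an interview, and +50.0% is that rate's relative lift over the without-interview rate:

```python
base = 19 / 25                         # 76% career allow rate (all resolved cases)
rate_with, rate_without = 0.99, 0.66   # hypothetical subgroup allow rates
lift = rate_with / rate_without - 1.0  # 0.99 / 0.66 - 1 = +50.0%
print(f"base {base:.0%}, with interview {rate_with:.0%}, lift {lift:+.1%}")
```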
