Last updated: May 29, 2026

Application No. 18/137,177

MULTIMODAL PROCEDURAL GUIDANCE CONTENT CREATION AND CONVERSION METHODS AND SYSTEMS

Final Rejection §103

Filed

Apr 20, 2023

Priority

Apr 20, 2022 — provisional 63/333,053 +2 more

Examiner

CRADDOCK, ROBERT J

Art Unit

2618

Tech Center

2600 — Communications

Assignee

The United States Of America AS Represented By The Secretary Of The Navy

OA Round

4 (Final)

Interview Optional

— +14.5% interview lift. Interview already conducted in this application's prosecution history. This examiner has a 84% grant rate with +14.5% interview lift. Since an interview has already been tried, recommend written response with narrowed claims based on precedent claim evolution patterns.

Based on 629 resolved cases, 2023–2026

Examiner Intelligence

CRADDOCK, ROBERT J View full profile →

Grants 84% — above average

Career Allowance Rate

530 granted / 629 resolved

+22.3% vs TC avg

Moderate +14% lift

Without

With

+14.5%

Interview Lift

resolved cases with interview

Typical timeline

2y 5m

Avg Prosecution

14 currently pending

Career history

650

Total Applications

across all art units

Statute-Specific Performance

§101

3.2%

-36.8% vs TC avg

§103

66.5%

+26.5% vs TC avg

§102

14.3%

-25.7% vs TC avg

§112

2.4%

-37.6% vs TC avg

Black line = Tech Center average estimate • Based on career data from 629 resolved cases

Office Action

§103

Notice of Pre-AIA  or AIA  Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA .

Response to Arguments filed 08/26/2025

On page 10 (as written), the applicant’s arguments regarding the previously pending objections are persuasive.

On page 10-14 (as written), the applicant argued the previous rejection. This argument is not persuasive and is addressed in the rejection below. 

The applicant hasn’t actually pointed to a particular limitation with the claims, nor are there any alleged difference pointed out from the prior art and the claims. The applicant copied and pasted the independent claims, then merely alleged the limitations are not disclosed and has made comments about Sharma et al, including the abstract, for which it doesn’t appear there is any relation to the actual rejection or the claims themselves because they have not been discussed. 

Under the above rationale the applicant’s arguments are not persuasive. 

Claim Objections
Claim 11 is objected to because of the following informalities: The fifth from the last line states, “Aligning” and should be “aligning”.  Appropriate correction is required.

Claim 21 is objected to because of the following informalities: The fifth from the last line states, “Aligning” and should be “aligning”.  Appropriate correction is required.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claim(s) 1-21 are rejected under 35 U.S.C. 103 as being unpatentable over Sharma et al. (US Patent No. 11,748,679).

Regarding claim 1, Sharma teaches a method  (See col. 3 line 14 – 25) for converting interactive modality information into a data structure by a mixed reality system comprising: a virtual reality system, an augmented reality system, and a mixed reality controller operatively associated with blending operational elements of both the virtual reality system and augmented reality system, the data structure configured for multimodal distribution and parallel content authoring with a plurality of modalities associated with the multimodal distribution, the method comprising (See col. 6 lines 1- 15: AR/VR/MR. Col. 3 line 14-45, col. 13-line 54-67 : collaborative workspace. The workspace has multiple modalities for collaboration. ):

a) acquiring interactive modality information by importing or opening one of a document file (col. 3 line 61 - 67), a video file (col. 16 line 3 – 61: The red, green and blue frames can be a video file, ), a voice recording file in a conversion application (col 12 line 36 – 49: voice command), or an interactive modality data file including one or more of a virtual reality data file, an augmented reality data file, and a 2D virtual environment data file (See col. 11 line 47- col. 12 line 28);

b) parallel authoring, by an indirect or a direct measures sensor, wherein the sensor creates a data structure based at least in part on information collected from a mixed reality environment (See abstract: sensing data that is direct/indirect measured. Fig. 31: element 3106, initial sensing, may be considered direct, the other analyzed and determined insights or what is prioritized may be considered to be indirect. Col. 7 line 47-65: discloses different indirect measures. Col. 19 line 37 – 50, multiple processors for parallel authoring);

c) identifying specific steps within a procedure included in the acquired interactive modality information through manual selection, programmatically, or by observing user interactions in an interactive modality (See col. 11 line 47- col. 12 line 28);

d) parsing the identified steps into distinct components using Al-based machine learning algorithms, advanced human toolsets, or a combination of both (See col. 11 line 47- col. 12 line 28:JSON: AI);

e) categorizing the parsed components based on their characteristics, the characteristics including one or more of verbs, objects, tools used (See col. 16 lines 23 – 31: voice commands are considered to have characteristics such as being a verb as they are defining an action, considered to be objects or tools), and reference images, using Al-based classification methods (col. 14 line 4 – 41: deep learning);

f) generating images or videos directly from the importing or opening of the interactive modality information and or the interactive modality file or known information about a step and a context of the step within the procedure (col. 14 line 4 – 41: deep learning);

g) storing the parsed and categorized components, and the generated images or videos, in a data structure designed for multimodal distribution (col. 14 line 4 – 41: deep learning); and

h) accessing and editing the interactive modality information in another modality (See Fig. 13, col. 16 line 32 – 38, col. 19 line 10 – 56: extended reality has multiple modalities of different realities within it); 
i) aligning and optimizing procedural guidance content, wherein the procedural guidance content is registered with real-world objects using artificial intelligence algorithms (See col. 12 Line 29 – 35: teaches an aligning system in. Furthermore the  See col. 11 line 47- col. 12 line 28:JSON: AI. col. 14 line 4 – 41: deep learning) but doesn’t explicitly disclose and j) reviewing and validating, by a human or a process, procedural guidance content for incomplete data and correct description.

Sharma teaches and j) reviewing and validating, by a human or a process, procedural guidance content for incomplete data and correct description (See col. 6 lines 1 – 15, “Extended reality, which encompasses virtual reality (VR), augmented reality (AR), and mixed reality (MR), when coupled with software analytics, may be implemented as
disclosed herein for utilizing project workspaces providing such immersive insights which may be situational, view sensitive, contextual and real-time. Immersive experiences may leverage affordances of natural human perception spatial
memory, and navigation and manipulation for better comprehension of 3D visualizations and enhanced creativity. These experiences may also be non-intrusive, which may
provide lower disruption in day-to-day processes. Moreover, the placement and type of the insights may be highly customizable as well, thus providing a seamless flow of
15 awareness throughout the workspace.” Col. 13 line 27 – 41, “FIG. 5 shows an example of the specific people and aisle category of visualization as a live screen grab through the device 106. The view of FIG. 5 may show pertinent insights as a 3D informational panel 500 anchored to the partitions in an aisle. This may help the user (e.g., a project lead) to quickly identify the progress and needs for guidance even
before reaching a project team members' desk, without disturbing the project team member(s). For the example of FIG. 5, the insights may now change to a subject-specific insight, rather than team-level insight. Further, insight prioritization components may be utilized herein as it is a subject-specific insight. The insights shown in FIG. 5 may be prioritized as disclosed herein with respect to FIGS.18-29.” Col. 3 lines 14 – col. 4 line 11, correct description such as a standard reports/documents/blueprints. Incomplete data may be email or chats. Also see MPEP 2173.05(h)).

Therefore it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Sharma in view of Sharma as combining the teachings with the known technique as it is ready for improvement and would yield predicable results. Furthermore combining the teachings would have been advantageous as doing so would improve efficiency and accuracy of Sharma.

Regarding claim 2, Sharma teaches the method of claim 1, further comprising: leveraging Al-based technology to generate 3D scene information through prompts or extracting relevant visual information from existing multimedia sources (col. 14 line 4 – line 41 and claim 1).

Regarding claim 3, Sharma teaches the method of claim 1, further comprising:
creating a 3D representation of a target physical system (col. 12 line 6-49);
receiving a part selection from an editor of the 3D representation, the part selection from one of a plurality of parts included in the target physical system (col. 12 line 6- col 13 line 5);
collecting part actions from the editor, the part actions associated with actions to be performed on the selected part (See Fig. 14-16);
creating queued annotations for the part actions, wherein the queued annotations are to be displayed in a 3D environment with respect to the 3D representation of the target physical system, and wherein at least one of the queued annotations includes a camera position recording based on a type of the corresponding part action and a location of the target system part (See col. 10 line 37-55);
collecting and associating augmented reality data with the queued annotations (See col. 10 line 37-55);
publishing a data structure bundle including a data set for generation of the queued annotations, the data set parsable to create mixed reality content (See col. 10 line 37-55, Fig. 14-16); and
a mixed reality system creating and presenting to a user content including the queued annotations from the data set, where the user interacts with the target physical system and parts included in the parts selection according to the queued annotations (See col. 10 line 37-55, Fig. 14-16).

Regarding claim 4, Sharma teaches the method of claim 1, further comprising:
collecting one or both of a text description for at least one of the queued annotations and an audio description for at least one of the queued annotations (See col. 10 line 37-55, Fig. 14-16: the description of the item can be considered an audio description as well. Also see MPEP 2173.05(h)).

Regarding claim 5, Sharma teaches the method of claim 1, further comprising:
utilizing a large language model (LLM) within an end application to construct language guidance and other generative content based on a parsed data structure which includes generated images, videos, and/or multimedia content, and considers context,
user preferences, and specific requirements (See col. 12 line 10 – 49, col. 14 line 4-41 and claim 1);
leveraging additional Al-based generative models to create or refine the images, videos, and/or multimedia content that complements the tailored language guidance (See col. 12 line 10 – 49, col. 14 line 4-41 and claim 1);
dynamically adapting the generated language guidance and other generative content to the user's interactions, preferences, or changes in the data structure to provide a user personalized experience (See col. 12 line 10 – 49, col. 14 line 4-41 and claim 1); and
outputting to one or more devices a constructed language guidance in the form of one or both of text and voice, and outputting to the one or more devices the associated images, videos, and multimedia content based on the user preferences, a device's capabilities, and a context in which the guidance is being provided (See col. 12 line 10 – 49, col. 14 line 4-41 and claim 1).

Regarding claim 6, Sharma teaches the method of claim 1, wherein the queued annotations are stored such that the queued annotations can be translated into at least one medium selected from a group of a 2D medium, and a 3D medium, wherein the queued annotations are presented in the at least one selected medium (Fig. 14-16: the annotations can be considered to be 2d or 3d. Also see MPEP 2173.05(h))).

Regarding claim 7, Sharma teaches the method of claim 1, wherein the queued annotations are stored such that the queued annotations can be translated into at least one format selected from a group of a document format, wherein the queued annotations are presented in the at least one selected format (Fig. 14-16: the annotations is considered to be a document format. Also see MPEP 2173.05(h)).

Regarding claim 8, Sharma teaches the method of claim 1, wherein a part selection and a part actions are received from an editors in a mixed reality environment (See Fig. 14-16).

Regarding claim 9, Sharma teaches the method of claim 1, wherein an editor works collaboratively in at least one environment selected from a group of a mixed reality environment and a desktop environment (See col. 4 line 46 - 56).

Regarding claim 10, Sharma teaches the method of claim 1, wherein the method for parallel content authoring publishes a data structure bundle comprising a data set for generation of the queued annotations, and the method for parallel content authoring publishes discrete individual outputs including a text, AR instructions (Fig. 14-16) and video (col. 16 line 3 – 61).

Claims 11-21 recite similar limitations to that of claims 1-10 and thus are rejected under similar rationale as detailed above.

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ROBERT J CRADDOCK whose telephone number is (571)270-7502. The examiner can normally be reached Monday - Friday 10:00 AM - 6 PM EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Devona E Faulk can be reached at 571-272-7515. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.



/ROBERT J CRADDOCK/Primary Examiner, Art Unit 2618

Read full office action

Prosecution Timeline

Show 3 earlier events

Jul 16, 2025

Final Rejection mailed — §103

Aug 07, 2025

Request for Continued Examination

Aug 08, 2025

Response after Non-Final Action

Aug 26, 2025

Response Filed

Aug 26, 2025

Applicant Interview (Telephonic)

Aug 26, 2025

Non-Final Rejection mailed — §103

Aug 28, 2025

Examiner Interview Summary

Dec 29, 2025

Final Rejection mailed — §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

18/421,528

Patent 12633025

CAMERA MAPPING IN A VIRTUAL EXPERIENCE

2y 3m to grant Granted May 19, 2026

18/463,320

Patent 12633066

METHOD FOR RENDERING VIRTUAL OBJECT, HOST, AND COMPUTER READABLE STORAGE MEDIUM

2y 8m to grant Granted May 19, 2026

18/533,705

Patent 12632506

OFFLOADING SLAM PROCESSING TO A REMOTE DEVICE

2y 5m to grant Granted May 19, 2026

18/589,098

Patent 12620164

USING NEURAL RADIANCE FIELDS FOR LABEL EFFICIENT IMAGE PROCESSING

2y 2m to grant Granted May 05, 2026

18/136,768

Patent 12597214

SCANNABLE CODES AS LANDMARKS FOR AUGMENTED-REALITY CONTENT

2y 11m to grant Granted Apr 07, 2026

Study what changed to get past this examiner. Based on 5 most recent grants.

Strategy Recommendation AI-generated — please review before filing

Get a prosecution strategy drawn from examiner precedents, rejection analysis, and claim mapping.

Typically takes 5-10 seconds — AI-generated, attorney review required before filing

Prosecution Projections

5-6

Expected OA Rounds

84%

Grant Probability

99%

With Interview (+14.5%)

2y 5m (~0m remaining)

Median Time to Grant

High

PTA Risk

Based on 629 resolved cases by this examiner. Grant probability derived from career allowance rate.

MULTIMODAL PROCEDURAL GUIDANCE CONTENT CREATION AND CONVERSION METHODS AND SYSTEMS

Interview Optional

Examiner Intelligence

Statute-Specific Performance

Office Action

Prosecution Timeline

Precedent Cases

Applications granted by this same examiner with similar technology

Strategy Recommendation AI-generated — please review before filing

Prosecution Projections

Ready to respond to this office action?

Sign in with your work email