DETAILED ACTION
A. This action is in response to the following communications: Request for Continued Examination filed 01/12/2026.
B. Claims 1-21 remain pending.
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 01/12/2026 has been entered.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-21 are rejected under 35 U.S.C. 103 as being unpatentable over Howard, Todd A (US Pub. 2021/0232632 A1), herein referred to as “Howard,” in view of Malkiel, Itzik et al. (US Pub. 2025/0238629 A1), herein referred to as “Malkiel.”
As for claims 1, 16, and 19, Howard teaches: A system, the corresponding method of claim 16, and the non-transitory computer-readable medium of claim 19 storing instructions that, when executed by a processor, program the processor to perform the following, comprising: a processor programmed to (par. 243, hardware environment used to execute the application):
automatically generate content comprising text content, visual content, and/or audio content (fig. 4A, item 401, a beholder (or user of the system) makes a request; par. 34-35, subject matter prompts (herein SMPs) are comprised of text, audio, and visual input) in response to an input prompt (par. 120-122, the user inputs (prompts) images, and captions (text) are automatically generated based upon the prompts. In par. 192, the user (beholder) prompts (requests), and automated audio is generated. In par. 150-151, the user's (beholder's) prompt (request) communicates multiple, tangible data structures related to the subject matter desired in the virtual experience; a beholder request may be initiated, generated, or modified in a variety of ways, including, but not limited to, as a result of a user's interaction with user interface elements of a user experience device, application, web site, or mobile device “app,” as described in regard to FIG. 1 and FIG. 6. The beholder request can also be generated via automated processes resulting from automated agents. Thus, the subject matter is prompted into the system, and a virtual experience including images, captions, and sound is automatically generated for the beholder (user));
perform, based on a harmonization model and/or consistency model, a harmonization check and/or a consistency check on the content (par. 156-157; fig. 4B, item 414; par. 4: For example, techniques and systems described herein advantageously provide technical methods for the extraction and separation of information facets from the base content returned from element sources, thus transforming content and metadata from the technical form usually returned from an element source (e.g., JSON, XML, HTML, images, video files, document files, and other proprietary formats) into discrete content elements that can be individually and separately used to construct a facet-segmented repository (and, ultimately, a bespoke virtual experience container). Searching the facet-segmented repositories (FSRs) of selected target identities (prompts from the user) retrieves discrete content elements (DCEs) (objects) whose primary concept aligns with the target primary and secondary concepts (prompts), which are stored separately as cache sets A and B);
recognize, based on the harmonization check and/or the consistency check, a conflict to be corrected (fig. 4B, item 416; par. 158: Supplement, refine, and/or reduce cache set A 413 using cache set B 415, forming cache set C 417. Cache set C contains discrete content elements (DCEs) having content with a particular conceptual classification stemming from the beholder request and related to the subject matter context; the system is able to supplement, refine, and/or reduce cache set A with cache set B to create the new cache set C);
identify a property of the content that should be changed based on the recognized conflict (cache set C is created; par. 169; FIG. 4A: a virtual experience container may be assembled from the selected discrete content elements in accordance with the user experience device parameters (403)); and
generate a corrective action based on the property of the content that should be changed (par. 215; FIG. 4A: the virtual experience container may be provided to the user experience device (404)).
Howard does not specifically teach automatic prompt generation; however, in the same field of endeavor, Malkiel teaches automatically generate, via a prompt generator (par. 95, automatically submitting a forward prompt), a verification prompt based on contextual and/or semantic information associated with the content (par. 95, item 804, obtaining a primary answer generated by a first language model and submitting a backward prompt that is used as a function of verification (to determine hallucination)); execute a language model with the verification prompt to determine whether one or more portions of the content are inconsistent across one or more modalities (par. 95, acquiring (808) at least one candidate question generated by the backward language model set and getting a digital vector similarity measurement between the primary question and the candidate question);
perform, based on a harmonization model and/or consistency model that accesses output of the executed language model, a harmonization check and/or a consistency check on the content, the harmonization check and/or the consistency check comprising a comparison of at least one of: (i) different versions of a content element using a difference representation and/or (ii) numerical embedding representations of the content element generated at different times; recognize, based on the harmonization check and/or the consistency check, a conflict to be corrected including at least one of a semantic conflict, a visual conflict, or a mood conflict (par. 95, assigning (812) a digital hallucination extent to the primary answer computed from at least the digital vector similarity measurement; this is collected without any input 452, which represents human review 454 of either the primary answer or the digital vector similarity measurement, by use of the standalone language model, the first language model, the backward language model, and other text embedding models).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Malkiel into Howard, since Malkiel suggests in paragraphs 3 and 4 that language models can be useful for computational tasks such as machine translation between natural languages, generation of text in natural languages, generation of source code text in programming languages, device control, image generation, information retrieval, information classification, information summarization, and many other computational tasks that involve communication efforts in written, spoken, or other form. The capabilities of many language models depend at least in part on the training data used to train the model in question. The training data is embedded into the language model in an encoded form, so that more training on a given topic tends to produce a model that performs better on that topic than another language model that is less well trained on the topic. However, despite advances, language models sometimes produce undesirable, irrelevant, or insufficient results. Accordingly, improvements in technical areas involving language models would be beneficial.
As for claims 2, 17, and 20, Howard teaches: The system of claim 1, wherein the content comprises multi-modal content and wherein to recognize the conflict to be corrected, the processor is programmed to: identify an inconsistency between text in the multi-modal content and a visual in the multi-modal content (par. 46: the SMP is too broad, and the system refines the prompt from the beholder; a subject matter prompt may be developed iteratively or interactively with the beholder, for example, to narrow down an open-ended prompt or suggest options to a beholder. For instance, the beholder request component 105 may, at the direction of the virtual experience service 120, present additional options on the UED 100 to the beholder 101 after an initial beholder request has been formulated. For example, if the prompt “Dad as a soldier” elicits too much content, the beholder request component 105 might render a prompt asking the beholder “How about Dad's experiences during the Korean War?”).
As for claims 3 and 18, Howard teaches: The system of claim 1, wherein the content comprises multi-modal content and wherein to recognize the conflict to be corrected, the processor is programmed to:
identify an inconsistency between a first visual in the multi-modal content and a second visual in the multi-modal content (par. 48, beholder input of visual content: “A beholder can also be creating, using, or consuming content (such as a document, article, photo, or video) in another application and can indicate a subject matter prompt using an indication motif provided across the entire user experience device by the beholder request component 105.” Par. 49: visual content can be sent by the user experience device (UED) to the virtual experience service to create a virtual experience container (object); the content can be text, audio, photo, video, etc.).
As for claim 4, Howard teaches: The system of claim 1, wherein the content comprises a prompt for input to a language model, and wherein to recognize the conflict, the processor is programmed to: identify an inconsistency between one or more first words in the prompt and one or more second words in the prompt (par. 57, 156-158: searching (using queries or full text) may be supplemented with advanced search techniques and operators, such as fuzzy search, proximity search, word stemming, synonym databases, and acronym databases; in some cases, therefore, a primary concept facet being “aligned with” the target primary concept may not mean “strictly identical to.” Supplement, refine, and/or reduce cache set A 413 using cache set B 415, forming cache set C 417. Thus, various inconsistencies can be found within the subject matter prompt to create a new cache set of results to return to the UED. A virtual experience service 120 (e.g., using the context analysis component 121 or other subcomponents) may interact with or direct requests to content interpretation service(s) 130 to assist in the identification of concepts in various kinds of content, including subject matter prompts, beholder-provided content, content repository media, and information feeds. Content interpretation service(s) 130 can be used to, for example: identify the grammatical or semantic structure of text, discern key concepts in text, translate text, and identify entities in text; classify objects or places or identify people in images, caption images, perform video and speech interpretation to elicit concepts, identify a speaker, translate speech, identify and track faces, and index content; and analyze the “sentiment” expressed by speech, text, or images).
As for claim 5, Howard teaches: The system of claim 4, wherein to generate a corrective action, the processor is programmed to: generate a recommendation to modify the prompt based on the recognized conflict (par. 65: query formulation can correct inconsistencies in a query submitted by the beholder to communicate with repositories more effectively. Transforming the subject matter context and other content into query terms and formulating queries from the query terms are described more fully with respect to the example process flow in FIG. 2A. It should also be noted that the content and information searching capabilities of the query formulation and search module(s) 122 may be employed incrementally, iteratively, in multiple stages, synchronously and asynchronously, and by multiple sub-components of the virtual experience service 120 (e.g., by context analysis 121, content element deconstruction 123, and experiential synthesis 125 components)).
As for claim 6, Howard teaches: The system of claim 4, wherein to generate a corrective action, the processor is programmed to: modify the prompt based on the recognized conflict (par. 64: query formulation modifies the beholder's query to make it more effective for the repositories it is communicating with).
As for claim 7, Howard teaches: The system of claim 1, wherein the content comprises a prompt for input to a language model, and wherein to recognize the conflict, the processor is programmed to: identify an inconsistency between one or more words in the prompt and one or more words in a previous prompt (par. 158: Cache set A and cache set B may be processed together using set arithmetic, such as unions, joins, and intersections. Searches and filtering (including Boolean logic) may be applied to either cache set A or B to select DCEs for cache set C. For example, DCEs for cache set C may be selected by taking the union of cache set A and cache set B to form cache set C; DCEs for cache set C may be selected by taking the union of cache sets A and B and then removing the DCEs that do not share a common content reference pointer, effectively meaning that a particular content segment must contain both primary and supplemental concepts to be included in cache set C. Par. 73-75: a virtual experience system/service needs to transform the structure of the query terms into a query language form suitable for functioning with a particular database service or into a form compatible with the particular API of a targeted element source).
As for claim 8, Howard teaches: The system of claim 7, wherein to generate a corrective action, the processor is programmed to: generate a recommendation to modify the prompt based on the recognized conflict (par. 73-75: a virtual experience system/service needs to transform the structure of the query terms into a query language form suitable for functioning with a particular database service or into a form compatible with the particular API of a targeted element source).
As for claim 9, Howard teaches: The system of claim 7, wherein to generate a corrective action, the processor is programmed to: modify the prompt based on the recognized conflict (par. 73-75: a virtual experience system/service needs to transform the structure of the query terms into a query language form suitable for functioning with a particular database service or into a form compatible with the particular API of a targeted element source).
As for claim 10, Howard teaches: The system of claim 1, wherein to recognize the conflict to be corrected, the processor is programmed to: identify and store an object in the content; and determine a difference between a first version of the object and a second version of the object, wherein the difference indicates the conflict to be corrected (par. 65: the virtual experience service 120 performs deconstruction of the obtained content into content elements (e.g., via 123) and re-classification, synthesis, and (in some cases) dimension-wise or facet-wise expansion of the content elements (e.g., via 125 and 126) into a unified virtual experience container 150 tuned to the capabilities of the user experience device 100. This can be seen in fig. 4B, with cache sets A and B used to create cache set C, as described in the analysis of claim 1 above).
As for claim 11, Howard teaches: The system of claim 1, wherein to recognize the conflict to be corrected, the processor is programmed to: identify and store an object in the content; generate a first embedding for the object at a first time; generate a second embedding for the object at a second time; and determine a difference between the first embedding and the second embedding, wherein the difference indicates the conflict to be corrected (par. 87: the content reference pointer can reference the physical or logical location of a single content file, a set of multiple content files, a compound content file, a content file that embeds the desired content, a specific range in a content file (e.g., the range of video in a file from time location 1:00:12.36 to 1:01:18:04), and/or a location in a database, data stream, or other data structure).
As for claim 12, Howard teaches: The system of claim 10, wherein to identify and store the object in the content, the processor is programmed to: store the object in a graph database that includes one or more other objects in a timeline of the content (par. 88: a schema of element facets contains a canonical structural organization for how discrete content elements are classified so that they can be used to construct the facet-segmented repository; par. 74: the virtual experience system can sometimes visualize its data as a graph using a graph API).
As for claim 13, Howard teaches: The system of claim 1, wherein to identify a property of the content that should be changed based on the recognized conflict, the processor is further programmed to: determine a first mood of a first portion of the content; determine a second mood of a second portion of the content; and determine that the first mood and the second mood conflict with one another; wherein to generate the corrective action, the processor is programmed to: modify the first portion and/or the second portion so that the first mood is consistent with the second mood (par. 129-132: sentiment analysis of content is used to determine an emotion (mood) of content to build sentiment facets onto data in order to select and deselect content elements for the virtual experience container; this is paired with the previous teachings as another layer of collecting and modifying content based upon the inputted SMP. Par. 48, beholder input of visual content: “A beholder can also be creating, using, or consuming content (such as a document, article, photo, or video) in another application and can indicate a subject matter prompt using an indication motif provided across the entire user experience device by the beholder request component 105.” Par. 49: visual content can be sent by the user experience device (UED) to the virtual experience service to create a virtual experience container (object); the content can be text, audio, photo, video, etc.).
As for claim 14, Howard teaches: The system of claim 13, wherein the first portion comprises first text and the second portion comprises second text, and wherein the processor is further programmed to: determine the first mood based on the first text; and determine the second mood based on the second text (par. 129-132: sentiment analysis of content is used to determine an emotion (mood) of content to build sentiment facets onto data in order to select and deselect content elements for the virtual experience container; this is paired with the previous teachings as another layer of collecting and modifying content based upon the inputted SMP. Par. 46: the SMP is too broad, and the system refines the prompt from the beholder; a subject matter prompt may be developed iteratively or interactively with the beholder, for example, to narrow down an open-ended prompt or suggest options to a beholder. For instance, the beholder request component 105 may, at the direction of the virtual experience service 120, present additional options on the UED 100 to the beholder 101 after an initial beholder request has been formulated. For example, if the prompt “Dad as a soldier” elicits too much content, the beholder request component 105 might render a prompt asking the beholder “How about Dad's experiences during the Korean War?”).
As for claim 15, Howard teaches: The system of claim 13, wherein the first portion comprises text and the second portion comprises an image, and wherein the processor is further programmed to: determine the first mood based on the text; and determine the second mood based on the image (par. 129-132: sentiment analysis of content is used to determine an emotion (mood) of content to build sentiment facets onto data in order to select and deselect content elements for the virtual experience container; this is paired with the previous teachings as another layer of collecting and modifying content based upon the inputted SMP. Par. 135: For example, if Mom (from the immediately prior example) also posted, along with the photo, “Y'all KNOW I just LOVE cheeseburgers!” the surface meaning of this text might be interpreted as expressing a positive sentiment toward the concept “cheeseburgers,” though the impression given by the photo and the capitalization pattern suggests otherwise. To ameliorate the problems associated with sarcasm when performing sentiment facetization, some implementations may use a trained AI classifier that detects sarcasm in text (e.g., TrueRatr from Cornell University). In some implementations, expressions of contrasting sentiment, such as those obtained from social network metadata or facial expression analysis, may be used to signal sarcasm in text content. A result of sarcasm detection via one or more such techniques may be that the “sarcastic” contrary text indicators are disregarded altogether or counted as an additional instance of the opposite sentiment toward the content element.).
Note: Any citation to specific pages, columns, lines, or figures in the prior art references, and any interpretation of the references, should not be considered to be limiting in any way. A reference is relevant for all it contains and may be relied upon for all that it would have reasonably suggested to one having ordinary skill in the art. In re Heck, 699 F.2d 1331, 1332-33, 216 USPQ 1038, 1039 (Fed. Cir. 1983) (quoting In re Lemelson, 397 F.2d 1006, 1009, 158 USPQ 275, 277 (CCPA 1968)).
Response to Arguments
Applicant’s arguments with respect to claim(s) 1-21 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Inquiries
Any inquiry concerning this communication should be directed to NICHOLAS AUGUSTINE at telephone number (571)270-1056.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
/NICHOLAS AUGUSTINE/
Primary Examiner, Art Unit 2178
February 20, 2026