Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 6 Mar 2026 has been entered.
DETAILED ACTION
Claims 1-20 are pending. Claims 1, 11, and 16 are independent.
Claims 2-10 depend from Claim 1.
Claims 12-15 depend from Claim 11.
Claims 17-20 depend from Claim 16.
This Application was published as U.S. 2024/0419903.
Response to Amendment
Examiner thanks Applicant for the response filed on 6 Mar 2026, which has been entered and considered in this Office action. Claims 1-20 are pending.
Response to Arguments
Applicant's arguments filed 6 Mar 2026 (pages 8-14) have been fully considered, but they are not persuasive. Each of Applicant's arguments will be addressed in turn.
With regards to Claim objections:
Applicant has amended claim 11. The amendments have been considered and are accepted. The objection to claim 11 has been withdrawn.
With regards to Provisional Double Patenting:
Applicant's remarks of 6 Mar 2026 have been considered, and the rejection is maintained until a terminal disclaimer is filed.
With regards to 35 USC § 103:
Applicant's arguments filed 6 Mar 2026 have been fully considered, but they are not persuasive. Applicant's arguments will be addressed in turn.
Claim 1:
Applicant argues the following on page 10 of Applicant's 6 Mar 2026 arguments:
With respect to Li, Applicant submits that paragraph [0026] of Li is limited to describing combining sensor-derived information into an NLP prompt. Li's architecture proceeds from sensor input to prompt generation, followed by tokenization of the prompt and optional mapping of tokens to embeddings representing semantic information about words in the prompt. Applicant submits that the tokens and embeddings in Li correspond to linguistic elements of the prompt. For example, Li is silent on describing extracting structured environmental features from sensor observations, encoding such structured environmental features into feature vectors, or forming a tokenized representation composed of feature vectors representing non-linguistic attributes of a physical environment. Applicant submits that Li's disclosure of combining sensor inputs into an NLP prompt does not equate to encoding environmental attributes into feature vectors that constitute the tokenized representation. Generating a prompt informed by sensor data is architecturally distinct from extracting structured environmental properties and encoding them into feature vectors representing physical structure.
Applicant's arguments are not persuasive for the primary reason that they interpret the prior art too narrowly.
MPEP 2123(I) states that patents are relevant as prior art for all they contain and “A reference may be relied upon for all that it would have reasonably suggested to one having ordinary skill in the art, including nonpreferred embodiments.” Merck & Co. v. Biocraft Labs., Inc. 874 F.2d 804, 10 USPQ2d 1843 (Fed. Cir. 1989), cert. denied, 493 U.S. 975 (1989).
Here, Applicant presents an overly narrow interpretation of Li by stating "Applicant submits that paragraph [0026] of Li is limited to describing combining sensor-derived information into an NLP prompt." (Applicant arguments, page 10.) On the other hand, Li states "the virtual assistant may combine input data from different types of sensors (such as cameras and microphones) into the same NLP prompt" (Par [0026]), where data from a camera includes visible attributes of the environment that are not limited to linguistic elements, as exemplified in Li Fig. 3, which shows how image data (303) from a sensor is used to generate a prompt for an NLP interface that includes non-linguistic elements such as image data. Thus, Li teaches non-linguistic attributes of the environment.
Applicant argues the following on pages 10-11 of Applicant's 6 Mar 2026 arguments:
In contrast, Li's language model generates conversational responses or service-oriented outputs based on tokenized prompts. For example, as described in Li [0031], tokens correspond to words in a prompt, and embeddings represent semantic information about those words. Li's inference concerns linguistic interpretation of user input or environmental triggers for conversational response generation. Li does not disclose generating a tokenized description representing an inferred structural relationship among spatial, geometric, topological, or kinematic environmental properties encoded as feature vectors, nor does Li disclose using such inferred structural relationships between two or more of the features to generate a map.
Applicant's arguments are not persuasive for two primary reasons: the arguments interpret the prior art too narrowly, and the claims are broadly recited.
Here, Applicant presents an overly narrow interpretation of Li by stating "For example, as described in Li [0031], tokens correspond to words in a prompt, and embeddings represent semantic information about those words." (Applicant arguments, page 11.) On the other hand, Li states that the Virtual Assistant (VA) "system 200 may be configured to support multiple modalities ... In some implementations, media content captured via one or more sensors (such as images, video, or audio) may be passed directly to the NLP" (Par [0051]), where data from a camera includes image data that is not limited to words in a prompt.
Further, MPEP 2111 requires that claims be given their broadest reasonable interpretation in light of the specification. Under the requirement for plain meaning, MPEP 2111.01(II) states that it is improper to import claim limitations from the specification.
Here, Li Figures 2-3 teach a VA system that supports multiple modalities including image and audio data, where the image data represents the visible inferred structural relationships between the topological, geometric, or relational information of the features, because images visibly show these relationships. Under the broadest reasonable interpretation, an image is a visible representation of the structures in the image, which are inferred structural relationships, and Li creates tokens or prompts (Li Par [0004]) which represent the tokenized representation of the set of observations.
Thus, for all the reasons presented above, Applicant's arguments regarding the 35 USC § 103 rejection of claim 1 are not persuasive.
Claims 11 and 16:
Applicant argues on page 12 of Applicant's 6 Mar 2026 arguments in a manner similar to the arguments presented for claim 1. For brevity, the reasoning is the same as for claim 1, and for all the reasons presented above, Applicant's arguments regarding the 35 USC § 103 rejections of claims 11 and 16 are not persuasive.
Dependent Claims 2-10, 12-15, and 17-20:
Applicant argues on page 13 of Applicant's 6 Mar 2026 arguments that the dependent claims are allowable because the independent claims are allowable. For all the reasons presented above, Applicant's arguments regarding the 35 USC § 103 rejections of dependent claims 2-10, 12-15, and 17-20 are not persuasive.
Thus, for all the reasons presented above for claims 1-20, Applicant's arguments regarding the 35 USC § 103 rejections are not persuasive.
Information Disclosure Statement
The information disclosure statement filed 6 Mar 2026 fails to comply with the provisions of 37 CFR 1.97(a) because it lacks the appropriate size fee set forth in 37 CFR 1.17(v). It has been placed in the application file, but the information referred to therein has not been considered as to the merits.
Per 37 CFR 1.17(v), Applicant is required to pay a fee if the cumulative number of IDS references submitted exceeds 50. It is the applicant's and patent owner's responsibility to track the cumulative number of items of information provided in the application. A request for continued examination (RCE) is not the filing of a new application, and thus the count does not reset when an RCE is filed. Here, the table below shows that the IDS references submitted exceed 50:
IDS submitted date      Reference types                Totals
16 Oct 2025             4 patents, 27 pubs, 9 NPL      40
6 Mar 2026              4 patents, 19 pubs, 7 NPL      30
Total IDS references                                   70
Provisional Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP § 2146 et seq. for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/process/file/efs/guidance/eTD-info-I.jsp.
Claims 1, 11, and 16 are rejected on the ground of nonstatutory double patenting as being unpatentable over claim 40 of co-pending application 18/365966 in view of Dintenfass (US 2018/0158157). Although the claims at issue are not identical, they are not patentably distinct from each other because the claims of the co-pending application are narrower in scope than those of the instant application, and capturing data with one or more sensors is obvious. The motivation to combine Dintenfass with 18/365966 is that Dintenfass teaches an augmented reality system that uses GPS to provide the location of the user and authenticates the user based on a user input (claim 1), which increases the capabilities of the invention of 18/365966 by providing environmental location and verifying the authenticity of the user.
Instant Application 18/472941, Claim 1:
A method, comprising:
generating, based at least on a set of observations captured using one or more sensors, a tokenized representation of the set of observations for at least a portion of an environment, the tokenized representation comprising at least one feature vector generated by encoding features, including non-linguistic attributes, as extracted from the set of observations;
generating, based at least on a language model processing the tokenized representation of the set of observations, a tokenized description of at least the portion of the environment, the tokenized description representing an inferred structural relationship between two or more of the features and determined based on at least one of semantic, topological, geometric, kinematic, or relational information of the features represented in the tokenized representation of the set of observations; and
generating a map for at least the portion of the environment using the tokenized description.
Co-pending Application 18/365966, Claim 40, in view of Dintenfass Claim 1:
A processor, comprising: one or more circuits to
generate an augmented map corresponding to an environment
based at least on processing data associated with a map (Dintenfass Claim 1: "a global position system (GPS) sensor configured to provide the geographic location of the user; generate a property token comprising: the user identifier, user history data for the user, and the location identifier")
using a trained language model and receiving, as output of the trained language model, (Dintenfass Claim 1: "generate a property token comprising: the user identifier, user history data for the user, and the location identifier")
a tokenized description of the environment,
determined based at least on spatial and semantic relationships between observations generated from the map. (see beginning of claim 40 above)
Instant Application 18/472941, Claim 11:
A processor, comprising: one or more circuits to:
generate, based at least on a large language model (LLM) processing sensor data obtained using one or more sensors, a tokenized description of at least the portion of an environment, the tokenized description representing an inferred structural relationship between features of at least the portion of the environment and determined based on at least one of semantic, topological, geometric, kinematic, or relational information of the features represented in the sensor data; and
generate, based at least on the tokenized description, a map for at least the portion of the environment, the map comprising a machine-readable representation configured for execution by a simulation system.
Co-pending Application 18/365966, Claim 40, in view of Dintenfass Claim 1:
A processor, comprising: one or more circuits to
generate an augmented map corresponding to an environment
based at least on processing data associated with a map using a trained language model and receiving, as output of the trained language model, (Dintenfass Claim 1: "a global position system (GPS) sensor configured to provide the geographic location of the user; generate a property token comprising: the user identifier, user history data for the user, and the location identifier")
a tokenized description of the environment,
determined based at least on spatial and semantic relationships between observations generated from the map. (see beginning of claim 40 above)
Instant Application 18/472941, Claim 16:
A system, comprising:
one or more processors to generate a map of an environment based at least on a tokenized description of at least a portion of the environment, the tokenized description representing an inferred structural relationship between features of at least the portion of the environment and generated based at least on a language model processing a set of observations of the environment determined using one or more sensors, the generated map comprising a machine-readable representation configured to be executed by a simulation system.
Co-pending Application 18/365966, Claim 40, in view of Dintenfass Claim 1:
A processor, comprising: one or more circuits to
generate an augmented map corresponding to an environment
based at least on processing data associated with a map using a trained language model and receiving, as output of the trained language model, a tokenized description of the environment,
determined based at least on spatial and semantic relationships between observations generated from the map.
(Dintenfass Claim 1: "a global position system (GPS) sensor configured to provide the geographic location of the user")
(Dintenfass Claim 1: "one or more processors operably coupled to the display and the GPS sensor")
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to a judicial exception (i.e., a law of nature, a natural phenomenon, or an abstract idea) without significantly more. The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception.
Independent claims 1, 11, and 16 recite various limitations that, but for generic computer components (i.e., one or more sensors, circuits, or processors), can be performed in the human mind or with pen and paper and are therefore considered abstract ideas, and using a machine learning model can be considered a mathematical calculation. Under the broadest reasonable interpretation, the claims cover the concept of observing an environment, describing an inferred structural relationship between observed features, and generating a map of the environment. (See MPEP 2106.04(a)(2) III)
This judicial exception is not integrated into a practical application because the claims only recite additional elements in the form of "sensors," "processors," "large language model," or "circuits." These elements are used to perform the claimed methods and steps and are recited at a high level of generality, such that they amount to no more than mere instructions to apply the exception using a generic computer component. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea.
The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception because they do not include subject matter that could not be performed by a human, as discussed above with respect to integration of the abstract idea into a practical application. The additional elements of using generic computing elements to perform the claimed steps amount to no more than mere instructions to apply the exception using a generic computer component, or can be considered insignificant extra-solution activity. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept, and mere data gathering in conjunction with an abstract idea cannot provide an inventive concept. For all the reasons stated above, the claims are not patent eligible.
With regards to claim 2, the claim further limits the elements of claim 1; however, these limitations do not preclude the limitations from being performed by mental observations or evaluation, or as a data gathering step that a person does in one’s one head as a mental process because how tokenized representation is generated neither a mathematical concept nor a mental process, nor a method of organizing human activity because it does not fall within the enumerated sub-groupings. Similar to claim 1, no additional elements beyond the use of generic computing elements that are well-understood, routine, conventional activities previously known to the industry. Furthermore, no improvement to a technology or technical field is provided in any limitation that satisfies the “model training on large-scale maps-making data driven performance improvements easier and more scalable with respect to domain expertise” … or “Processing efficiency can also be improved by replacing manual labor with machine learning model-based automation.” (Par [0049]) Therefore, the judicial exception is not integrated into a practical application nor are the elements sufficient to amount to significantly more than the judicial exception.
With regards to claim 3, the claim further limits the elements of claim 1; however, these limitations do not preclude the limitations from being performed by mental observations or evaluation, or as a data gathering step that a person does in one’s one head as a mental process because using an automated end-to-end process is a data gathering step. Similar to claim 1, no additional elements beyond the use of generic computing elements that are well-understood, routine, conventional activities previously known to the industry. Furthermore, no improvement to a technology or technical field is provided in any limitation that satisfies the “model training on large-scale maps-making data driven performance improvements easier and more scalable with respect to domain expertise” … or “Processing efficiency can also be improved by replacing manual labor with machine learning model-based automation.” (Par [0049]) Therefore, the judicial exception is not integrated into a practical application nor are the elements sufficient to amount to significantly more than the judicial exception.
With regards to claim 4, the claim further limits the elements of claim 1; however, these limitations do not preclude the limitations from being performed by mental observations or evaluation, or as a data gathering step that a person does in one’s one head as a mental process because format of map generated is neither a mathematical concept nor a mental process, nor a method of organizing human activity because it does not fall within the enumerated sub-groupings. Similar to claim 1, no additional elements beyond the use of generic computing elements that are well-understood, routine, conventional activities previously known to the industry. Furthermore, no improvement to a technology or technical field is provided in any limitation that satisfies the “model training on large-scale maps-making data driven performance improvements easier and more scalable with respect to domain expertise” … or “Processing efficiency can also be improved by replacing manual labor with machine learning model-based automation.” (Par [0049]) Therefore, the judicial exception is not integrated into a practical application nor are the elements sufficient to amount to significantly more than the judicial exception.
With regards to claim 5, the claim further limits the elements of claim 1; however, these limitations do not preclude the limitations from being performed by mental observations or evaluation, or as a data gathering step that a person does in one’s one head as a mental process because type of tokenized string representation is neither a mathematical concept nor a mental process, nor a method of organizing human activity because it does not fall within the enumerated sub-groupings. Similar to claim 1, no additional elements beyond the use of generic computing elements that are well-understood, routine, conventional activities previously known to the industry. Furthermore, no improvement to a technology or technical field is provided in any limitation that satisfies the “model training on large-scale maps-making data driven performance improvements easier and more scalable with respect to domain expertise” … or “Processing efficiency can also be improved by replacing manual labor with machine learning model-based automation.” (Par [0049]) Therefore, the judicial exception is not integrated into a practical application nor are the elements sufficient to amount to significantly more than the judicial exception.
With regards to claim 6, the claim further limits the elements of claim 5; however, these limitations do not preclude the limitations from being performed by mental observations or evaluation, or as a data gathering step that a person does in one’s one head as a mental process because language text string is written in is neither a mathematical concept nor a mental process, nor a method of organizing human activity because it does not fall within the enumerated sub-groupings. Similar to claim 5, no additional elements beyond the use of generic computing elements that are well-understood, routine, conventional activities previously known to the industry. Furthermore, no improvement to a technology or technical field is provided in any limitation that satisfies the “model training on large-scale maps-making data driven performance improvements easier and more scalable with respect to domain expertise” … or “Processing efficiency can also be improved by replacing manual labor with machine learning model-based automation.” (Par [0049]) Therefore, the judicial exception is not integrated into a practical application nor are the elements sufficient to amount to significantly more than the judicial exception.
With regards to claim 7, the claim further limits the elements of claim 1; however, these limitations do not preclude the limitations from being performed by mental observations or evaluation, or as a data gathering step that a person does in one’s one head as a mental process because type of observations is neither a mathematical concept nor a mental process, nor a method of organizing human activity because it does not fall within the enumerated sub-groupings. Similar to claim 1, no additional elements beyond the use of generic computing elements that are well-understood, routine, conventional activities previously known to the industry. Furthermore, no improvement to a technology or technical field is provided in any limitation that satisfies the “model training on large-scale maps-making data driven performance improvements easier and more scalable with respect to domain expertise” … or “Processing efficiency can also be improved by replacing manual labor with machine learning model-based automation.” (Par [0049]) Therefore, the judicial exception is not integrated into a practical application nor are the elements sufficient to amount to significantly more than the judicial exception.
With regards to claim 8, the claim further limits the elements of claim 1; however, these limitations do not preclude the limitations from being performed by mental observations or evaluation, or as a data gathering step that a person does in one’s one head as a mental process because how an observation is captured is neither a mathematical concept nor a mental process, nor a method of organizing human activity because it does not fall within the enumerated sub-groupings. Similar to claim 1, no additional elements beyond the use of generic computing elements that are well-understood, routine, conventional activities previously known to the industry. Furthermore, no improvement to a technology or technical field is provided in any limitation that satisfies the “model training on large-scale maps-making data driven performance improvements easier and more scalable with respect to domain expertise” … or “Processing efficiency can also be improved by replacing manual labor with machine learning model-based automation.” (Par [0049]) Therefore, the judicial exception is not integrated into a practical application nor are the elements sufficient to amount to significantly more than the judicial exception.
With regards to claim 9, the claim further limits the elements of claim 1; however, these limitations do not preclude the limitations from being performed by mental observations or evaluation, or as a data gathering step that a person does in one’s one head as a mental process because type of sensor is neither a mathematical concept nor a mental process, nor a method of organizing human activity because it does not fall within the enumerated sub-groupings. Similar to claim 1, no additional elements beyond the use of generic computing elements that are well-understood, routine, conventional activities previously known to the industry. Furthermore, no improvement to a technology or technical field is provided in any limitation that satisfies the “model training on large-scale maps-making data driven performance improvements easier and more scalable with respect to domain expertise” … or “Processing efficiency can also be improved by replacing manual labor with machine learning model-based automation.” (Par [0049]) Therefore, the judicial exception is not integrated into a practical application nor are the elements sufficient to amount to significantly more than the judicial exception.
With regards to claim 10, the claim further limits the elements of claim 1; however, these limitations do not preclude the limitations from being performed by mental observations or evaluation, or as a data gathering step that a person does in one’s one head as a mental process because including an additional feature is neither a mathematical concept nor a mental process, nor a method of organizing human activity because it does not fall within the enumerated sub-groupings. Similar to claim 1, no additional elements beyond the use of generic computing elements that are well-understood, routine, conventional activities previously known to the industry. Furthermore, no improvement to a technology or technical field is provided in any limitation that satisfies the “model training on large-scale maps-making data driven performance improvements easier and more scalable with respect to domain expertise” … or “Processing efficiency can also be improved by replacing manual labor with machine learning model-based automation.” (Par [0049]) Therefore, the judicial exception is not integrated into a practical application nor are the elements sufficient to amount to significantly more than the judicial exception.
With regards to claim 12, the claim further limits the elements of claim 11; however, these limitations do not preclude the limitations from being performed by mental observations or evaluation, or as a data gathering step that a person does in one’s one head as a mental process because how a tokenized representation is generated is neither a mathematical concept nor a mental process, nor a method of organizing human activity because it does not fall within the enumerated sub-groupings. Similar to claim 11, no additional elements beyond the use of generic computing elements that are well-understood, routine, conventional activities previously known to the industry. Furthermore, no improvement to a technology or technical field is provided in any limitation that satisfies the “model training on large-scale maps-making data driven performance improvements easier and more scalable with respect to domain expertise” … or “Processing efficiency can also be improved by replacing manual labor with machine learning model-based automation.” (Par [0049]) Therefore, the judicial exception is not integrated into a practical application nor are the elements sufficient to amount to significantly more than the judicial exception.
With regards to claim 13, the claim further limits the elements of claim 11; however, these limitations do not preclude the limitations from being performed by mental observations or evaluation, or as a data gathering step that a person does in one’s one head as a mental process because type of tokenized text string representation is neither a mathematical concept nor a mental process, nor a method of organizing human activity because it does not fall within the enumerated sub-groupings. Similar to claim 11, no additional elements beyond the use of generic computing elements that are well-understood, routine, conventional activities previously known to the industry. Furthermore, no improvement to a technology or technical field is provided in any limitation that satisfies the “model training on large-scale maps-making data driven performance improvements easier and more scalable with respect to domain expertise” … or “Processing efficiency can also be improved by replacing manual labor with machine learning model-based automation.” (Par [0049]) Therefore, the judicial exception is not integrated into a practical application nor are the elements sufficient to amount to significantly more than the judicial exception.
With regards to claim 14, the claim further limits the elements of claim 11; however, these limitations do not preclude the limitations from being performed by mental observations or evaluation, or as a data gathering step that a person performs in one’s own head as a mental process, because the inclusion of an additional feature is neither a mathematical concept nor a mental process, nor a method of organizing human activity, because it does not fall within the enumerated sub-groupings. Similar to claim 11, no additional elements are recited beyond the use of generic computing elements that perform well-understood, routine, conventional activities previously known to the industry. Furthermore, no limitation provides an improvement to a technology or technical field, such as the asserted “model training on large-scale maps-making data driven performance improvements easier and more scalable with respect to domain expertise” … or “Processing efficiency can also be improved by replacing manual labor with machine learning model-based automation.” (Par [0049]) Therefore, the judicial exception is not integrated into a practical application, nor are the elements sufficient to amount to significantly more than the judicial exception.
With regards to claim 15, the claim further limits the elements of claim 11; however, these limitations do not preclude the limitations from being performed by mental observations or evaluation, or as a data gathering step that a person performs in one’s own head as a mental process, because the system comprising the processor is neither a mathematical concept nor a mental process, nor a method of organizing human activity, because it does not fall within the enumerated sub-groupings. Similar to claim 11, no additional elements are recited beyond the use of generic computing elements that perform well-understood, routine, conventional activities previously known to the industry. Furthermore, no limitation provides an improvement to a technology or technical field, such as the asserted “model training on large-scale maps-making data driven performance improvements easier and more scalable with respect to domain expertise” … or “Processing efficiency can also be improved by replacing manual labor with machine learning model-based automation.” (Par [0049]) Therefore, the judicial exception is not integrated into a practical application, nor are the elements sufficient to amount to significantly more than the judicial exception.
With regards to claim 17, the claim further limits the elements of claim 16; however, these limitations do not preclude the limitations from being performed by mental observations or evaluation, or as a data gathering step that a person performs in one’s own head as a mental process, because use of a language model to process a tokenized representation is neither a mathematical concept nor a mental process, nor a method of organizing human activity, because it does not fall within the enumerated sub-groupings. Similar to claim 16, no additional elements are recited beyond the use of generic computing elements that perform well-understood, routine, conventional activities previously known to the industry. Furthermore, no limitation provides an improvement to a technology or technical field, such as the asserted “model training on large-scale maps-making data driven performance improvements easier and more scalable with respect to domain expertise” … or “Processing efficiency can also be improved by replacing manual labor with machine learning model-based automation.” (Par [0049]) Therefore, the judicial exception is not integrated into a practical application, nor are the elements sufficient to amount to significantly more than the judicial exception.
Claim 18 is a system claim with limitations corresponding to the limitations of method Claim 13 and is rejected under similar rationale.
Claim 19 is a system claim with limitations corresponding to the limitations of method Claim 14 and is rejected under similar rationale.
Claim 20 is a system claim with limitations corresponding to the limitations of method Claim 15 and is rejected under similar rationale.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-5 and 7-20 are rejected under 35 U.S.C. 103 as being unpatentable over Li et al. (US 2024/0395261), hereinafter Li, in view of Dintenfass (US 2018/0158157), hereinafter Dintenfass.
With regards to claim 1, Li teaches:
A method, comprising:
generating, based at least on a set of observations captured using one or more sensors, [Li Fig 1 item 112 Par [0030] teaches “sensors 112 may include any suitable sensor technology that can be used to detect changes in a surrounding environment”]
a tokenized representation of the set of observations for at least a portion of an environment; [Li Par [0031] teaches “prompt creation component 114 is configured to generate a prompt 103 based on environmental changes detected by the sensors 112. For example, the prompt 103 may include a sequence of tokens that are suitable for processing by the NLP 120”]
the tokenized representation comprising at least one feature vector generated by encoding features, including non-linguistic attributes, as extracted from the set of observations; [Li Fig 1 teaches the prompt includes a “sequence of tokens” … [which] “may be further mapped to a sequence of vectors (also referred to as “embeddings”) which provide semantic information about the words represented in the prompt 103.” (Par [0031]) where the “virtual assistant may combine input data from different types of sensors (such as cameras and microphones) into the same NLP prompt.” (Par [0026]) where data from a camera includes visible attributes of the environment, which are non-linguistic attributes, as exemplified in Li Fig 3, which shows how image data (303) from a sensor is used to generate a prompt for an NLP interface that includes non-linguistic elements such as image data]
generating, based at least on a language model processing the tokenized representation of the set of observations, [Li Par [0031] teaches “prompt creation component 114 may encode the text into a sequence of tokens that can be processed by the NLP 120”]
a tokenized description of at least the portion of the environment, the tokenized description representing an inferred structural relationship between two or more of the features and determined based on at least one of semantic, topological, geometric, kinematic, or relational information of the features represented in the tokenized representation of the set of observations; and [Li teaches “the tokens may be further mapped to a sequence of vectors (also referred to as “embeddings”) which provide semantic information about the words represented in the prompt 103.” (Par [0031]) Li Figures 2-3 teach the VA system represents multiple modalities that include image and audio data, where image data represents the visible inferred structural relationships between the topological, geometric, or relational information of the features because images visibly show these relationships, and Li creates tokens or prompts (Li Par [0004]) which represent the tokenized representation of the set of observations]
With regards to claim 1, Li fails to teach:
generating a map for at least the portion of the environment using the tokenized description.
With regards to claim 1, Dintenfass teaches:
generating a map for at least the portion of the environment using the tokenized description. [Dintenfass Fig 7 item 714 teaches “augmented reality user device 200 generates a map based on neighborhood information provided by the virtual assessment data 111” (Par [0162]) where at “step 712, the augmented reality user device 200 receives virtual assessment data 111 from the remote server 102 in response to sending the property token 110 to the remote server 102.” (Par [0161])]
It would have been obvious to one of ordinary skill in the art at the time of Applicant’s effective filing date to combine the virtual assistant taught by Li with the augmented reality device of Dintenfass. The motivation to combine the inventions of Li and Dintenfass is that Dintenfass teaches “augmented reality user device 200 uses geographic location information provided by a GPS sensor with a map database to determine the location of the user 106” (Par [0042]), which increases the capabilities of the invention of Li to interact with the user using user location data.
With regards to claim 2, Li in view of Dintenfass teaches:
All the limitations of claim 1
wherein the tokenized representation of the set of observations is generated using the language model, a second language model, or a data encoder. [Li teaches “prompt 103 may include a sequence of tokens that are suitable for processing by the NLP 120” (Par [0031]) where NLP is a natural language processing model]
With regards to claim 3, Li in view of Dintenfass teaches:
All the limitations of claim 1
wherein the map is generated using the tokenized description corresponding to the set of observations using an automated end-to-end process. [Dintenfass Fig 7 teaches an automated end-to-end process in steps 706-714]
With regards to claim 4, Li in view of Dintenfass teaches:
All the limitations of claim 1
wherein the map generated using the tokenized description includes one or more maps, or sets of map data, in one or more of a set of map formats. [Dintenfass Par [0162] teaches “augmented reality user device 200 generates a two-dimensional or a three-dimensional map that overlays the neighborhood information with a geographical map.”]
With regards to claim 5, Li in view of Dintenfass teaches:
All the limitations of claim 1
wherein the tokenized description is a tokenized text string representative of at least the portion of the environment, the tokenized text string including a sequence of tokens associated with objects in the environment. [Li Par [0031] teaches “tokens may be further mapped to a sequence of vectors (also referred to as “embeddings”) which provide semantic information about the words represented in the prompt 103” where semantic information is a feature of the environment]
With regards to claim 7, Li in view of Dintenfass teaches:
All the limitations of claim 1
wherein set of observations further includes at least one of lighting data, weather data, human annotations, prior map data, or other data relevant to use cases considered. [Li Par [0030] teaches “suitable sensor technologies may include audio and visual sensor technologies, among other examples. In some implementations, the sensors 112 may include one or more microphones that are configured to detect sounds (such as the user speech 102) propagating through the environment” where visual sensors such as cameras include lighting data, and audio sensors such as microphones include speech or human annotations]
With regards to claim 8, Li in view of Dintenfass teaches:
All the limitations of claim 1
wherein at least a subset of observations is captured using one or more sensors on a machine positioned in, or moving through, the portion of the environment. [Li Fig 1 Par [0029] teaches “environment 100 includes a user 101, a virtual assistant 110, and a natural language processor (NLP) 120” where “virtual assistant 110 includes one or more sensors 112” (Par [0030]) where the user and sensors are positioned in, or moving through, the portion of the environment where the user is located]
With regards to claim 9, Li in view of Dintenfass teaches:
All the limitations of claim 1
wherein the sensors include at least one of camera sensors, radar sensors, LiDAR sensors, ultrasonic sensors, or depth sensors. [Li Par [0082] teaches “sensors 502 may include one or more depth sensing technologies that can be used to determine the distance of Person X. Example suitable depth sensing technologies include radio detection and ranging (RADAR) and light detection and ranging (LiDAR), among other examples. In some other implementations, the subsystem controller 310 may estimate the distance of Person X based on images captured by a monocular camera”]
With regards to claim 10, Li in view of Dintenfass teaches:
All the limitations of claim 1
wherein the tokenized description of at least the portion of the environment generated by the language model includes at least one additional feature, corrected feature, or enhanced feature with respect to the features contained in the tokenized representation of the set of observations. [Li teaches “input sources 210 may include any suitable sensor technology that can be used to detect changes in a surrounding environment (such as cameras, motion sensors, or microphones). Thus, the input data 201 may include sensor data indicating the changes detected by the sensor” (Par [0041]) where motion sensors measure change in kinematics, which is at least one additional feature]
With regards to claim 11, Li teaches:
A processor, comprising: one or more circuits to: [Li Par [0022] teaches “various illustrative logical blocks, modules, circuits and instructions described in connection with the embodiments disclosed herein may be executed by one or more processors (or a processing system)”]
generate, based at least on a large language model (LLM) processing sensor data obtained using one or more sensors, [Li Fig 1 item 112 teaches sensors] a tokenized description of at least the portion of an environment,
the tokenized description representing an inferred structural relationship between features of at least the portion of the environment and determined based on at least one of semantic, topological, geometric, kinematic, or relational information of the features represented in the sensor data; and [Li teaches “the tokens may be further mapped to a sequence of vectors (also referred to as “embeddings”) which provide semantic information about the words represented in the prompt 103.” (Par [0031]) Li Figures 2-3 teach the VA system represents multiple modalities that include image and audio data, where image data represents the visible inferred structural relationships between the topological, geometric, or relational information of the features because images visibly show these relationships, and Li creates tokens or prompts (Li Par [0004]) which represent the tokenized representation of the set of observations]
With regards to claim 11, Li fails to teach:
generate, based at least on the tokenized description, a map for at least the portion of the environment, the map comprising a machine-readable representation configured for execution by a simulation system.
With regards to claim 11, Dintenfass teaches:
generate, based at least on the tokenized description, a map for at least the portion of the environment. [Dintenfass Fig 7 item 714 teaches “augmented reality user device 200 generates a map based on neighborhood information provided by the virtual assessment data 111” (Par [0162]) where at “step 712, the augmented reality user device 200 receives virtual assessment data 111 from the remote server 102 in response to sending the property token 110 to the remote server 102.” (Par [0161])]
the map comprising a machine-readable representation configured for execution by a simulation system. [Dintenfass Fig 7 item 714 teaches “augmented reality user device 200” (Par [0162]) is a simulation system, and “Method 700 is employed by the processor 202 of the augmented reality user device 200 to generate property tokens 110 based on the user 106 of the augmented reality user device 200 and the location of the user 106” (Par [0156]) where processors provide machine-readable representations to be performed on the augmented reality simulation system]
It would have been obvious to one of ordinary skill in the art at the time of Applicant’s effective filing date to combine the virtual assistant taught by Li with the augmented reality device of Dintenfass. The motivation to combine the inventions of Li and Dintenfass is that Dintenfass teaches “augmented reality user device 200 uses geographic location information provided by a GPS sensor with a map database to determine the location of the user 106” (Par [0042]), which increases the capabilities of the invention of Li to interact with the user using user location data.
With regards to claim 12, Li in view of Dintenfass teaches:
All the limitations of claim 11
wherein the tokenized description of at least the portion of the environment is generated using a tokenized representation of the sensor data. [Li Par [0031] teaches “prompt creation component 114 is configured to generate a prompt 103 based on environmental changes detected by the sensors 112. For example, the prompt 103 may include a sequence of tokens that are suitable for processing by the NLP 120”]
With regards to claim 13, Li in view of Dintenfass teaches:
All the limitations of claim 11
wherein the tokenized description is a tokenized text string representative of at least the portion of the environment, the tokenized text string including a sequence of tokens associated with one or more objects or the features in the portion of the environment. [Li Par [0031] teaches “tokens may be further mapped to a sequence of vectors (also referred to as “embeddings”) which provide semantic information about the words represented in the prompt 103” where semantic information corresponds to features in the portion of the environment]
With regards to claim 14, Li in view of Dintenfass teaches:
All the limitations of claim 11
wherein the tokenized description of at least the portion of the environment includes at least one additional feature, corrected feature, or enhanced feature with respect to the features contained in the tokenized representation. [Li teaches “input sources 210 may include any suitable sensor technology that can be used to detect changes in a surrounding environment (such as cameras, motion sensors, or microphones). Thus, the input data 201 may include sensor data indicating the changes detected by the sensor” (Par [0041]) where motion sensors measure change in kinematics, which is at least one additional feature]
With regards to claim 15, Li in view of Dintenfass teaches:
All the limitations of claim 11
wherein the simulation system comprises at least one of:
a system for performing simulation operations;
a system for performing simulation operations to test or validate autonomous machine applications;
a system for performing digital twin operations;
a system for performing light transport simulation;
a system for rendering graphical output;
a system for performing deep learning operations; [Li teaches deep learning operations such as “suitable neural networks for natural language processing include recurrent neural networks (RNNs), long short-term memory (LSTM) networks, and transformers, among other examples” (Par [0033])]
a system for performing generative AI operations using a large language model (LLM); [Li teaches “natural language processing models (also referred to as large language models or “LLMs”)” (Par [0024])]
a system implemented using an edge device;
a system for generating or presenting virtual reality (VR) content;
a system for generating or presenting augmented reality (AR) content; [Dintenfass Fig 1 teaches “augmented reality system 100”]
a system for generating or presenting mixed reality (MR) content;
a system incorporating one or more Virtual Machines (VMs);
a system implemented at least partially in a data center;
a system for performing hardware testing using simulation;
a system for performing generative operations using a language model (LM);
a system for synthetic data generation;
a collaborative content creation platform for 3D assets; or
a system implemented at least partially using cloud computing resources.
With regards to claim 16, Li teaches:
A system, comprising: one or more processors [Li Par [0022] teaches “various illustrative logical blocks, modules, circuits and instructions described in connection with the embodiments disclosed herein may be executed by one or more processors (or a processing system)”]
based at least on a tokenized description of at least a portion of the environment, the tokenized description representing an inferred structural relationship between features of at least the portion of the environment and generated based at least on a language model processing a set of observations of the environment determined using one or more sensors. [Li Par [0031] teaches “prompt creation component 114 is configured to generate a prompt 103 based on environmental changes detected by the sensors 112. For example, the prompt 103 may include a sequence of tokens that are suitable for processing by the NLP 120”. Li further teaches “the tokens may be further mapped to a sequence of vectors (also referred to as “embeddings”) which provide semantic information about the words represented in the prompt 103.” (Par [0031]) Li Figures 2-3 teach the VA system represents multiple modalities that include image and audio data, where image data represents the visible inferred structural relationships between the topological, geometric, or relational information of the features because images visibly show these relationships, and Li creates tokens or prompts (Li Par [0004]) which represent the tokenized representation of the set of observations]
With regards to claim 16, Li fails to teach:
to generate a map of an environment
the generated map comprising a machine-readable representation configured to be executed by a simulation system.
With regards to claim 16, Dintenfass teaches:
to generate a map of an environment [Dintenfass Fig 7 item 714 teaches “augmented reality user device 200 generates a map based on neighborhood information provided by the virtual assessment data 111” (Par [0162]) where at “step 712, the augmented reality user device 200 receives virtual assessment data 111 from the remote server 102 in response to sending the property token 110 to the remote server 102.” (Par [0161])]
the generated map comprising a machine-readable representation configured to be executed by a simulation system. [Dintenfass Fig 7 item 714 teaches “augmented reality user device 200” (Par [0162]) is a simulation system, and “Method 700 is employed by the processor 202 of the augmented reality user device 200 to generate property tokens 110 based on the user 106 of the augmented reality user device 200 and the location of the user 106” (Par [0156]) where processors provide machine-readable representations to be performed on the augmented reality simulation system]
It would have been obvious to one of ordinary skill in the art at the time of Applicant’s effective filing date to combine the virtual assistant taught by Li with the augmented reality device of Dintenfass. The motivation to combine the inventions of Li and Dintenfass is that Dintenfass teaches “augmented reality user device 200 uses geographic location information provided by a GPS sensor with a map database to determine the location of the user 106” (Par [0042]), which increases the capabilities of the invention of Li to interact with the user using user location data.
With regards to claim 17, Li in view of Dintenfass teaches:
All the limitations of claim 16
wherein the language model is configured to process a tokenized representation of the set of observations. [Li Par [0031] teaches “prompt creation component 114 is configured to generate a prompt 103 based on environmental changes detected by the sensors 112. For example, the prompt 103 may include a sequence of tokens that are suitable for processing by the NLP 120”]
Claim 18 is a system claim with limitations corresponding to the limitations of method Claim 13 and is rejected under similar rationale.
Claim 19 is a system claim with limitations corresponding to the limitations of method Claim 14 and is rejected under similar rationale.
Claim 20 is a system claim with limitations corresponding to the limitations of method Claim 15 and is rejected under similar rationale.
Claim 6 is rejected under 35 U.S.C. 103 as being unpatentable over Li et al. (US 2024/0395261) in view of Dintenfass (US 2018/0158157), in further view of Bouguerra et al. (US 2024/0354491), hereinafter Bouguerra.
With regards to claim 6, Li in view of Dintenfass teaches:
All the limitations of claim 5
With regards to claim 6, Li in view of Dintenfass fails to teach:
wherein the tokenized text string is written in a road topology language (RTL) or a domain specific language (DSL).
With regards to claim 6, Bouguerra teaches:
wherein the tokenized text string is written in a road topology language (RTL) or a domain specific language (DSL). [Bouguerra teaches “digest engine 300 can be configured with one or more artificial intelligence (AI) algorithm(s) and/or machine learning (ML) model(s) and/or large language models (LLM)” (Par [0026]) and “LLM can be configured to first tokenize text input into a sequence of words … [and] the digest engine 300 is not limited to utilizing a transformer large language model but can utilize any type of machine learning model(s) such as a zero-shot, domain-specific, fine-tuned, language representation, bidirectional encoder representations from transformers, multimodal, vector representation or any combination thereof.”]
It would have been obvious to one of ordinary skill in the art at the time of Applicant’s effective filing date to combine the virtual assistant taught by Li in view of Dintenfass with the digest engine as taught by Bouguerra. The motivation to combine the inventions of Li and Dintenfass with the invention of Bouguerra is that Bouguerra uses an LLM and “Once trained the LLM is capable of generating a prompt to display to a user in the form of an output text, (e.g., automated output” (Par [0026]), which increases the capabilities of the invention of Li in view of Dintenfass to interact with users using prompts.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Joseph J Yamamoto whose telephone number is (571)272-4020. The examiner can normally be reached M-F 1000-1800 EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Bhavesh Mehta can be reached at 571-272-7453. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
JOSEPH J. YAMAMOTO
Examiner
Art Unit 2656
/ANDREW C FLANDERS/Supervisory Patent Examiner, Art Unit 2655