Prosecution Insights
Last updated: April 19, 2026
Application No. 18/035,849

KNOWLEDGE INJECTION MODEL FOR GENERATIVE COMMONSENSE REASONING

Status: Non-Final OA (§103)
Filed: May 08, 2023
Examiner: RUTTEN, JAMES D
Art Unit: 2121
Tech Center: 2100 — Computer Architecture & Software
Assignee: Microsoft Technology Licensing, LLC
OA Round: 1 (Non-Final)
Grant Probability: 63% (Moderate)
Expected OA Rounds: 1-2
Time to Grant: 4y 1m
With Interview: 99%

Examiner Intelligence

Career Allow Rate: 63% (365 granted / 580 resolved; +7.9% vs TC avg)
Interview Lift: +38.4% for resolved cases with an interview
Typical Timeline: 4y 1m average prosecution; 23 applications currently pending
Career History: 603 total applications across all art units
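The interview-lift figure above can be made concrete with a small sketch. The dashboard does not disclose how many of the 580 resolved cases involved an interview, so the with/without split below is invented for illustration; only the 365/580 career totals are taken from the page. The split is chosen so the totals remain consistent and the computed lift lands near the reported +38.4 percentage points.

```python
# Sketch of an interview-lift computation. The 145/435 split of the 580
# resolved cases is hypothetical; 365 granted / 580 resolved is from the
# dashboard above.

def allow_rate(granted: int, resolved: int) -> float:
    """Fraction of resolved cases that were allowed."""
    return granted / resolved

career = allow_rate(365, 580)                # career allow rate, ~63%

# Hypothetical split (counts invented, but they sum to 365/580):
with_interview    = allow_rate(133, 145)     # cases where an interview was held
without_interview = allow_rate(232, 435)     # cases without an interview

lift = with_interview - without_interview    # percentage-point lift
print(f"career: {career:.0%}, interview lift: {lift:+.1%}")
```

Under this (assumed) split, cases with an interview are allowed at roughly 92% versus roughly 53% without, a lift of about 38.4 percentage points.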

Statute-Specific Performance

§101: 10.0% (-30.0% vs TC avg)
§103: 50.6% (+10.6% vs TC avg)
§102: 11.2% (-28.8% vs TC avg)
§112: 16.7% (-23.3% vs TC avg)
Deltas are relative to a Tech Center average estimate. Based on career data from 580 resolved cases.
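The "vs TC avg" deltas in the table reduce to a simple subtraction, which also lets us recover the Tech Center baseline they were measured against. The snippet below is a sketch using only the four rows above; notably, all four implied averages coincide at 40.0%, suggesting the deltas were computed against a single overall estimate rather than per-statute baselines.

```python
# Recover the implied Tech Center average from each row of the
# statute-specific performance table: examiner rate minus reported delta.
statute_stats = {            # statute: (examiner rate %, delta vs TC avg %)
    "§101": (10.0, -30.0),
    "§103": (50.6, +10.6),
    "§102": (11.2, -28.8),
    "§112": (16.7, -23.3),
}

tc_avg = {s: rate - delta for s, (rate, delta) in statute_stats.items()}
for statute, avg in tc_avg.items():
    print(f"{statute}: implied TC average = {avg:.1f}%")
```

All four rows print an implied average of 40.0%.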

Office Action

§103
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA. Claims 1-15 have been examined.

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim(s) 1-3, 5-12 and 14-15 is/are rejected under 35 U.S.C. 103 as being unpatentable over U.S. Patent Application Publication 20200311341 by Chaturvedi et al. ("Chaturvedi") in view of U.S. Patent Application Publication 20100049684 by Adriaansen et al. ("Adriaansen").

In regard to claim 1, Chaturvedi discloses:

1. A system comprising: at least one processor; and memory storing instructions that, when executed by the at least one processor, causes the system to perform a set of operations, the set of operations comprising:

See Chaturvedi, Fig. 7, depicting a system.

receiving an indication comprising a search query from a computing device;

See Chaturvedi ¶ 0036, “The cue phrase can be provided by any entity outside the system, such as a knowledge base or a human user.” ¶ 0041, “… generating new portions of a story based on a story already produced and cued input. … A starting portion (e.g., the first portion supplied initially, e.g., from a human, based on a title, or at random) includes one or more tokens represented in FIG. 1C by token 120a, token 120b, token 120c, ellipsis indicating one or more other tokens, and token 120d.”

obtaining, based on a knowledge corpus, a prototype for a set of concepts associated with the search query;

Chaturvedi Fig. 1A, element 110 “Training Corpus.” ¶ 0035, “The documents 111 for the corpus 110 are selected to be appropriate for the type of story to be told.”

encoding an input based on the set of concepts and the obtained prototype, the input comprising one or more concept input tokens for the set of concepts and one or more prototype input tokens for the obtained prototype;

Chaturvedi ¶ 0042, “The artificial intelligence (AI) story generation module 140 produces one or more new column vectors 141, depicted in FIG. 1C as vectors with elements e1 through eT, ellipsis, and f1 through fT, respectively, which are associated with corresponding text tokens 120e through 120f, respectively, with ellipsis indicating zero or more intervening vectors and corresponding tokens. To determine the meaning of the context in matrix 131 and the cue phrase in matrix 151, the artificial intelligence story generation module 140 of system 100 uses and combines the lower dimension vectors associated with each vector, as described in more detail below.”

scaling the encoded input to decrease a first norm for an encoded output state of a first prototype input token …;

Chaturvedi ¶ 0056, “In the stacked attention based context encoder 381 a weighted sum of the set of Q, K and V vectors for the context is performed to produce one set of Q, K and V vectors that pays attention to some aspect of semantics of the context. That is, the information in at least one of the Q/K/V projections is used to draw attention to certain semantics represented by one or more of the remaining projections.” Also ¶ 0059, “So, the attention module includes a layer normalization (Ba et al., 2016) applied in module 440 on the new joint representation obtained by Equation 2.”

Chaturvedi does not expressly disclose: that is similar to a first concept input token of the concept input tokens. However, this is taught by Adriaansen. See Adriaansen ¶ 0164-0166, “determining a similarity score between the plurality of concepts at 2203; and predicting that two or more of the plurality of concepts have a relationship wherein the overlap is above a first threshold and the similarity score is above a second threshold at 2204.” It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to use Adriaansen’s concept relationship with Chaturvedi’s tokens in order to determine conceptual relationships for providing relevant results as suggested by Adriaansen (see e.g. ¶ 0154-0155).

Chaturvedi also discloses:

generating a set of position indicators for input tokens of the input;

Chaturvedi Fig. 1C, element 131, depicting a vector matrix providing a broad but reasonable interpretation of position indicators. Also see ¶ 0041, “each token 120a through 120d is embedded in a vector with corresponding elements a1 through aT, b1 through bT, c1 through cT, and di through dT, respectively, to produce a context matrix 131.”

decoding the scaled encoded output based on the set of position indicators to generate a model output;

Chaturvedi Fig. 1C elements 140 and 141 along with ¶ 0042, “The artificial intelligence (AI) story generation module 140 produces one or more new column vectors 141, depicted in FIG. 1C as vectors with elements e1 through eT, ellipsis, and f1 through fT, respectively, which are associated with corresponding text tokens 120e through 120f, respectively, with ellipsis indicating zero or more intervening vectors and corresponding tokens.”

identifying, based on the generated model output, targeted content; and

Chaturvedi ¶ 0042, “Following that, a Softmax function is used to convert the final vector representation into a probability distribution over the training vocabulary. The system then predicts/generates the most probable token according to the Softmax result.”

providing, to the computing device, the identified targeted content in response to the received indication.

Chaturvedi ¶ 0042, “The tokens 120e through 120f provide the next portion 144 of text for the story. The next portion 144 is added to the new document 130, e.g., concatenated at the end of the new document 130 growing that document to an updated new document 130.”

In regard to claim 2, Chaturvedi also discloses:

2. The system of claim 1, wherein the prototype for the set of concepts is obtained based on a search result responsive to the received search query.

¶ 0042, “To determine the meaning of the context in matrix 131 and the cue phrase in matrix 151, the artificial intelligence story generation module 140 of system 100 uses and combines the lower dimension vectors associated with each vector,”

In regard to claim 3, Chaturvedi also discloses:

3. The system of claim 1, wherein generating the set of position indicators comprises: for each input token: when the input token is a concept input token, generating a position indicator of a first value; when the input token is a prototype input token that is similar to a concept input token, generating a position indicator of a second value that is greater than the first value; and when the input token is a prototype input token that is not similar to a concept input token, generating a position indicator of a third value that is greater than a position indicator value of a most proximate prototype input token that is similar to a concept input token.

¶ 0039, “To distinguish many nuances of semantics, each token 120 is embedded in a vector 122 of expansive dimension T, say 256 to 1024 dimensions. The formation of the vectors is controlled such that tokens with similar meanings are close together in the vector space of these T dimensions and tokens with unrelated meanings are far apart.”

In regard to claim 5, Chaturvedi also discloses:

5. The system of claim 2, wherein the search result responsive to the received search query is retrieved from the knowledge corpus.

¶ 0042, “uses and combines the lower dimension vectors associated with each vector.”

In regard to claim 6, Chaturvedi also discloses:

6. The system of claim 5, wherein the knowledge corpus is determined from a set of knowledge corpora based on the received search query.

Fig. 1A, elements 110 and 111. Also ¶ 0042, “To determine the meaning of the context in matrix 131 and the cue phrase in matrix 151, the artificial intelligence story generation module 140 of system 100 uses and combines the lower dimension vectors associated with each vector.”

In regard to claim 7, Chaturvedi also discloses:

7. The system of claim 1, wherein the knowledge corpus is one of an in-domain knowledge corpus or an out-of-domain knowledge corpus.

¶ 0035, “The documents 111 for the corpus 110 are selected to be appropriate for the type of story to be told.”

In regard to claim 8, Chaturvedi discloses:

8. A system comprising: at least one processor; and memory storing instructions that, when executed by the at least one processor, causes the system to perform a set of operations, the set of operations comprising:

See Chaturvedi, Fig. 7, depicting a system.

receiving a request comprising a set of concepts;

See Chaturvedi ¶ 0036, “The cue phrase can be provided by any entity outside the system, such as a knowledge base or a human user.” ¶ 0041, “… generating new portions of a story based on a story already produced and cued input. … A starting portion (e.g., the first portion supplied initially, e.g., from a human, based on a title, or at random) includes one or more tokens represented in FIG. 1C by token 120a, token 120b, token 120c, ellipsis indicating one or more other tokens, and token 120d.”

generating a prototype for the set of concepts based on a knowledge corpus;

Chaturvedi Fig. 1A, element 110 “Training Corpus.” ¶ 0035, “The documents 111 for the corpus 110 are selected to be appropriate for the type of story to be told.”

encoding an input that comprises a set of input tokens, wherein the set of input tokens comprises concept input tokens of the set of concepts and prototype input tokens of the prototype;

Chaturvedi ¶ 0042, “The artificial intelligence (AI) story generation module 140 produces one or more new column vectors 141, depicted in FIG. 1C as vectors with elements e1 through eT, ellipsis, and f1 through fT, respectively, which are associated with corresponding text tokens 120e through 120f, respectively, with ellipsis indicating zero or more intervening vectors and corresponding tokens. To determine the meaning of the context in matrix 131 and the cue phrase in matrix 151, the artificial intelligence story generation module 140 of system 100 uses and combines the lower dimension vectors associated with each vector, as described in more detail below.”

generating a set of position indicators for input tokens of the input, …

Chaturvedi Fig. 1C, element 131, depicting a vector matrix providing a broad but reasonable interpretation of position indicators. Also see ¶ 0041, “each token 120a through 120d is embedded in a vector with corresponding elements a1 through aT, b1 through bT, c1 through cT, and di through dT, respectively, to produce a context matrix 131.”

… wherein each position indicator indicates a relative distance of an input token to a most proximate input token similar to a concept input token;

¶ 0039, “To distinguish many nuances of semantics, each token 120 is embedded in a vector 122 of expansive dimension T, say 256 to 1024 dimensions. The formation of the vectors is controlled such that tokens with similar meanings are close together in the vector space of these T dimensions and tokens with unrelated meanings are far apart.”

decoding the encoded output based on the set of position indicators to generate a model output; and

Chaturvedi Fig. 1C elements 140 and 141 along with ¶ 0042, “The artificial intelligence (AI) story generation module 140 produces one or more new column vectors 141, depicted in FIG. 1C as vectors with elements e1 through eT, ellipsis, and f1 through fT, respectively, which are associated with corresponding text tokens 120e through 120f, respectively, with ellipsis indicating zero or more intervening vectors and corresponding tokens.”

providing, in response to the request, the generated model output.

Chaturvedi ¶ 0042, “The tokens 120e through 120f provide the next portion 144 of text for the story. The next portion 144 is added to the new document 130, e.g., concatenated at the end of the new document 130 growing that document to an updated new document 130.”

In regard to claims 9-10, parent claim 8 is addressed above. All further limitations of claims 9-10 have been addressed in the above rejections of claims 1 and 7, respectively.

In regard to claim 11, Chaturvedi discloses:

11. A method for generating a model output based on a set of concepts, the method comprising:

See Fig. 3A, depicting a flow chart method. All further limitations of claim 11 have been addressed in the above rejections of claims 1 and 8.

In regard to claim 12, parent claim 11 is addressed above. All further limitations of claim 12 have been addressed in the above rejection of claim 1.

In regard to claim 14, parent claim 11 is addressed above. All further limitations of claim 14 have been addressed in the above rejection of claim 7.

In regard to claim 15, Chaturvedi also discloses:

15. The method of claim 14, wherein the knowledge corpus is determined from a set of knowledge corpora based on the set of concepts.

Fig. 1A, elements 110 and 111. Also ¶ 0042, “To determine the meaning of the context in matrix 131 and the cue phrase in matrix 151, the artificial intelligence story generation module 140 of system 100 uses and combines the lower dimension vectors associated with each vector.”

Claim(s) 4 is/are rejected under 35 U.S.C. 103 as being unpatentable over Chaturvedi in view of Adriaansen as applied above, and further in view of U.S. Patent Application Publication 20110191175 by Elbaz et al. ("Elbaz").

In regard to claim 4, Chaturvedi does not expressly disclose:

4. The system of claim 3, wherein the third value is linearly determined based on a distance to the most proximate prototype input token that is similar to the concept input token.

However, this is taught by Elbaz. See Elbaz ¶ 0055, “FIG. 4 also shows semantic distances between elements. "Ski" and "skiing" have only a distance of 2 between them while "skiing" and "sport" have a distance of 5 (7-2). The distance between "ski" and "sport" is 7. When traveling from parent to child or vice-versa, the distances can be simply added/subtracted but when changing the direction of travel, a penalty may be imposed upon the distance calculation.” It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to use Elbaz’ distance calculation with Chaturvedi’s tokens in order to quantify and understand meanings within a semantic space as suggested by Elbaz (see ¶ 0009 and ¶ 0054).

Claim(s) 13 is/are rejected under 35 U.S.C. 103 as being unpatentable over Chaturvedi in view of Adriaansen as applied above, and further in view of U.S. Patent Application Publication 20180174020 by Wu et al. ("Wu").

In regard to claim 13, Chaturvedi does not expressly disclose the limitations. However, they are taught as follows:

13. The method of claim 11, further comprising: receiving, from a computing device, the set of concepts as keywords associated with targeted content; and

See Wu, ¶ 0001, “A chat bot may utilize sophisticated natural language processing systems or scan for keywords from a user input and then pull a reply with the most matching keywords or the most similar wording pattern from a database.”

storing the model output as one of a descriptive headline or descriptive summary associated with the targeted content.

See Wu, ¶ 0084, “The context summary system 112 utilizes a learning algorithm, a vector system, and/or a feed-forward neural network language model to characterize and summarize the context information to some impact short sentence (with only high term weight words left) to help make a better context-sensitive classification/understanding of user's emotions.”

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to use Wu’s use of keywords and content summary with Chaturvedi’s concepts in order to help make better context-sensitive classification/understanding associated with a search as suggested by Wu (see ¶ 0084).

Conclusion

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. “Domain Specific Corpora from the Web” by PVS et al. See PVS, section 1 on p. 336, “Language usage is dependent on domain (Hanks, 2000) and domain specific corpora are consequently extremely useful for language learning and lexicography (Barrière, 2009; Drouin, 2004). It is possible to label heterogeneous data for domain either manually (Atkins et al., 2010) or automatically (for a survey see (Sebastiani, 2002)) using human knowledge or machine learning.”

Any inquiry concerning this communication or earlier communications from the examiner should be directed to James D Rutten whose telephone number is (571)272-3703. The examiner can normally be reached M-F 9:00-5:30 ET.

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Li B Zhen, can be reached at (571)272-3768. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/James D. Rutten/
Primary Examiner, Art Unit 2121
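The position-indicator scheme recited in rejected claims 3 and 4 is itself a small algorithm: concept tokens get a first value, prototype tokens similar to a concept token get a larger second value, and the remaining prototype tokens get a third value that (per claim 4) grows linearly with distance to the nearest similar prototype token. The sketch below illustrates that logic only; the token names, concrete values, and similarity flags are invented, and the application itself defines the actual scheme.

```python
# Hypothetical illustration of the claim 3/4 position-indicator scheme.
# FIRST and SECOND are assumed base values; the claims only require
# SECOND > FIRST and the third value to exceed the nearest anchor's value.
FIRST, SECOND = 0, 1

def position_indicators(is_concept, is_similar):
    """is_concept / is_similar: parallel bool lists over the input tokens."""
    # indices of prototype tokens that are similar to some concept token
    anchors = [i for i, (c, s) in enumerate(zip(is_concept, is_similar))
               if not c and s]
    out = []
    for i, c in enumerate(is_concept):
        if c:
            out.append(FIRST)                    # concept token: first value
        elif is_similar[i]:
            out.append(SECOND)                   # similar prototype: second value
        else:
            # dissimilar prototype: linear in distance to nearest anchor
            d = min(abs(i - a) for a in anchors)
            out.append(SECOND + d)               # third value (claim 4)
    return out

# Two concept tokens followed by four prototype tokens, where only the
# first prototype token is similar to a concept token:
concept = [True, True, False, False, False, False]
similar = [False, False, True, False, False, False]
print(position_indicators(concept, similar))  # → [0, 0, 1, 2, 3, 4]
```

In the example, the dissimilar prototype tokens receive 2, 3, 4: each value exceeds the nearest similar token's value of 1 and increases linearly with distance, matching the claim language.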

Prosecution Timeline

May 08, 2023: Application Filed
Dec 27, 2025: Non-Final Rejection (§103)
Mar 10, 2026: Applicant Interview (Telephonic)
Mar 10, 2026: Examiner Interview Summary

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12579423: SYSTEMS AND METHODS FOR PREDICTING BIOLOGICAL RESPONSES (granted Mar 17, 2026; 2y 5m to grant)
Patent 12555004: PATH-SUFFICIENT EXPLANATIONS FOR MODEL UNDERSTANDING (granted Feb 17, 2026; 2y 5m to grant)
Patent 12541707: METHOD AND SYSTEM FOR DEVELOPING A MACHINE LEARNING MODEL (granted Feb 03, 2026; 2y 5m to grant)
Patent 12510888: MODEL REDUCTION AND TRAINING EFFICIENCY IN COMPUTER-BASED REASONING AND ARTIFICIAL INTELLIGENCE SYSTEMS (granted Dec 30, 2025; 2y 5m to grant)
Patent 12511577: DETERMINING AVAILABILITY OF NETWORK SERVICE (granted Dec 30, 2025; 2y 5m to grant)
Study what changed to get past this examiner, based on the 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: 63%
With Interview: 99% (+38.4%)
Median Time to Grant: 4y 1m
PTA Risk: Low

Based on 580 resolved cases by this examiner. Grant probability derived from career allow rate.
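The projection figures above can be reconciled with a minimal sketch, under one stated assumption: the interview lift is treated as an additive percentage-point adjustment to the base grant probability, capped at 99%. The dashboard's actual model is not disclosed; this only reproduces the displayed numbers (63% base, 99% with interview).

```python
# Sketch of the projection arithmetic, assuming an additive
# percentage-point interview lift with a 99% cap. The cap and the
# additive model are assumptions, not the dashboard's documented method.

def projected_grant_probability(base: float, interview_lift: float,
                                cap: float = 0.99) -> float:
    """Base probability plus lift, capped (all values as fractions)."""
    return min(base + interview_lift, cap)

base = 365 / 580                                      # career allow rate, ~63%
with_interview = projected_grant_probability(base, 0.384)
print(f"base: {base:.0%}, with interview: {with_interview:.0%}")
```

Here 63% + 38.4 points would exceed 100%, so the cap is what yields the displayed 99%.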
