Prosecution Insights
Last updated: April 19, 2026
Application No. 18/507,953

MULTI-MODALITY SYSTEM FOR RECOMMENDING MULTIPLE ITEMS USING INTERACTION AND METHOD OF OPERATING THE SAME

Final Rejection — §101, §103
Filed: Nov 13, 2023
Examiner: HASAN, SYED HAROON
Art Unit: 2154
Tech Center: 2100 — Computer Architecture & Software
Assignee: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE
OA Round: 2 (Final)
Grant Probability: 82% (Favorable)
Expected OA Rounds: 3-4
Time to Grant: 3y 2m
With Interview: 97%

Examiner Intelligence

Career Allow Rate: 82% — above average (597 granted / 732 resolved; +26.6% vs TC avg)
Interview Lift: strong, +15.5% for resolved cases with interview
Typical Timeline: 3y 2m average prosecution; 39 applications currently pending
Career History: 771 total applications across all art units
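As a quick sanity check, the headline figures in this block can be reproduced from the raw counts. A minimal sketch (the dashboard's exact rounding rules are an assumption):

```python
# Reproduce the examiner stats shown above from the raw counts:
# 597 granted of 732 resolved, 771 total applications filed.
granted = 597
resolved = 732
total_applications = 771

career_allow_rate = granted / resolved             # ~0.816, displayed as 82%
currently_pending = total_applications - resolved  # applications still open

print(f"Career allow rate: {career_allow_rate:.1%}")
print(f"Currently pending: {currently_pending}")
```

Note that 597/732 is 81.6%, so the displayed "82%" is presumably rounded to the nearest whole percent.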

Statute-Specific Performance

§101: 18.3% (-21.7% vs TC avg)
§103: 34.8% (-5.2% vs TC avg)
§102: 20.8% (-19.2% vs TC avg)
§112: 21.1% (-18.9% vs TC avg)
Black line = Tech Center average estimate • Based on career data from 732 resolved cases
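The per-statute deltas above all point back to a single baseline. A small sketch, with figures transcribed from the chart and the assumption that each delta is the examiner's rate minus the Tech Center average:

```python
# Recover the implied Tech Center baseline from each statute's pair of
# figures: (examiner rate %, delta vs TC average %).
stats = {
    "101": (18.3, -21.7),
    "103": (34.8, -5.2),
    "102": (20.8, -19.2),
    "112": (21.1, -18.9),
}

for statute, (rate, delta) in stats.items():
    tc_average = rate - delta  # subtracting a negative delta adds it back
    print(f"Section {statute}: implied TC average {tc_average:.1f}%")
```

Every row implies the same 40.0% baseline, consistent with the single "black line" Tech Center average estimate in the chart.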

Office Action

Grounds of rejection: §101, §103
DETAILED ACTION

Case Status

This office action is in response to remarks and amendments of 8 October 2025. Claims 1-11, 14-16 and 18-20 have been examined.

Claim Rejections - 35 USC § 101

35 U.S.C. 101 reads as follows:

Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-11, 14-16 and 18-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. Claims 1-11, 14-16 and 18-20 are directed to one of the eligible categories of subject matter.

With respect to independent claims 1, 14, and 18, the "preprocesses," "converts," "configures," "evaluate," "calculate," and "classify" limitations cover performance of the limitations manually and/or in the mind (mental processes abstract idea). The "output," "input," "receives," and "transmitted" limitations are recited at a high level of generality and do not add meaningful limitations to the abstract idea; these limitations are directed to insignificant extra-solution activities. The claims as a whole merely describe how to generally "apply" the exception in a computer environment using generic computer functions or components (such as the claimed neural network model). Even when viewed in combination, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. The claims are not patent eligible.

With respect to dependent claim 10, the "filtering," "connects," and "convert" limitations cover performance of the limitations manually and/or in the mind (mental processes abstract idea).
The pre-trained language model is recited at a high level of generality and does not add meaningful limitations to the abstract idea. The claims as a whole merely describe how to generally "apply" the exception in a computer environment using generic computer functions or components (such as the pre-trained language model). Even when viewed in combination, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. The claims are not patent eligible.

With respect to dependent claims 4, 7, 8, 9, 11, 15, 16, and 19, the "assign," "cluster," "divide," "generating," "concatenates," "separates," "converts," "expressed," "evaluate," "calculate," "classify," and "preprocessing" limitations cover performance of the limitations manually and/or in the mind (mental processes abstract idea). No additional elements are recited, and so the claims do not provide a practical application and are not considered to be significantly more. The claims are not eligible.

With respect to dependent claims 2, 3, 5, 6, and 20, the "single neural network," "transformer," "output," "includes information," and "item type" limitations are recited at a high level of generality and do not add meaningful limitations to the abstract idea. The claims as a whole merely describe how to generally "apply" the exception in a computer environment using generic computer functions or components (such as a single neural network or transformer). Even when viewed in combination, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. The claims are not patent eligible.
Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

This application currently names joint inventors. In considering patentability of the claims, the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.
Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

Claims 1, 2, 3, 9, 10, 14, 16, 18, 19 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Wu et al., "Multimodal Conversational Fashion Recommendation with Positive and Negative Natural-Language Feedback," hereinafter Wu, in view of Bhaskaran et al., Pub. No. US 20200309923 A1, hereinafter Bhaskaran.

As per claim 1, Wu discloses a multi-modality system for recommending multiple items using an interaction, comprising: an interaction data preprocessing module that preprocesses an interaction data set and converts the preprocessed interaction data set into interaction training data (see section 1, at least the first and last two paragraphs; section 2, at least the last paragraph; and section 3 — the dialog is tracked turn-by-turn as the model's input state, and a BERT text feature extractor encodes text attributes); an item data preprocessing module that preprocesses item information data and converts the preprocessed item information data into item training data (see Wu as mapped above; the image feature extractor encodes item images and the BERT text feature extractor encodes text attributes); a learning module that includes a neural network model that is trained using the interaction training data and the item training data and outputs a result including a set of recommended items using a conversation context with a user as input (see rejection of above limitations; note that in Wu, a transformer-based neural network model trains on both the dialogue data and item feature data and outputs recommended items in response to the user's conversational inputs.
Note: the claimed conversation context is merely chat or dialogue, as explained in par. 55 of the published specification); and an evaluation module that evaluates the set of recommended items, wherein the evaluation module is configured to calculate a confidence score for two inputs that are the conversation context with the user and each item, or two items included in one set of recommended items (sections 5 and 6, including Figs. 3 and 5 and Table 1).

Wu does not explicitly disclose, however Bhaskaran, in the related field of endeavor of machine learning, discloses wherein the evaluation module is trained to classify the two inputs as true/false through a binary classifier, and the confidence score is based on a logit value of the binary classifier (Bhaskaran, pars. 128-133). Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of the cited references because Bhaskaran would have allowed Wu to implement the well-known technique of using a binary classifier ML model to output a confidence score indicating a likelihood that an input is associated with a classification, the confidence score indicated as a logit (Bhaskaran, pars. 128-133).

As per claim 14, it is analogous to claim 1 and therefore likewise rejected.
As per claim 18, Wu discloses a multi-modality system for recommending multiple items using an interaction, comprising: a user device that receives a conversation for item recommendation input from a user; and an item recommendation system that configures the conversation input from the user device and an answer transmitted to the user device into a series of conversation contexts, inputs the conversation contexts to a pre-trained neural network model, and outputs a result including a set of recommended items (see section 1, at least the first and last two paragraphs; section 2, at least the last paragraph; and section 3 — a transformer-based neural network model trains on both a user's device-inputted/device-transmitted "answer" dialogue data and item feature data and outputs recommended items), wherein the item recommendation system comprises an evaluation module that evaluates the set of recommended items, wherein the evaluation module is configured to calculate a confidence score for two inputs that are the conversation context with the user and each item, or two items included in one set of recommended items (sections 5 and 6, including Figs. 3 and 5 and Table 1).

Wu does not explicitly disclose, however Bhaskaran, in the related field of endeavor of machine learning, discloses wherein the evaluation module is trained to classify the two inputs as true/false through a binary classifier, and the confidence score is based on a logit value of the binary classifier (Bhaskaran, pars. 128-133). Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of the cited references because Bhaskaran would have allowed Wu to implement the well-known technique of using a binary classifier ML model to output a confidence score indicating a likelihood that an input is associated with a classification, the confidence score indicated as a logit (Bhaskaran, pars. 128-133).
As per claim 2, Wu as modified discloses the multi-modality system of claim 1, wherein the neural network model is a single neural network that processes the interaction training data and the item training data (see Wu as cited in the rejection of claim 1 — the Multimodal Interactive Transformer (MIT) model is a single NN that uses a transformer).

As per claim 3, Wu as modified discloses the multi-modality system of claim 2, wherein the neural network is based on a transformer (see Wu as cited in the rejection of claim 1 — the Multimodal Interactive Transformer (MIT) model is a single NN that uses a transformer).

As per claim 9, Wu as modified discloses the multi-modality system of claim 1, wherein the item data preprocessing module separates the item information data into text information data and non-text information data, converts the text information data into a text feature, and converts the non-text information data into a non-text feature (see Wu as cited in the rejection of claim 1, including section 4, second paragraph, and Table 2 of section 6; also, section 6.4 discloses that the system understands textual dataset names of the images and that text about the images, such as color information, is understood as a non-text feature in all subsequent interactions of the State Tracking/history as disclosed in section 3.2).

As per claim 16, it is analogous to claim 9 and therefore likewise rejected.

As per claim 10, Wu as modified discloses the method of claim 9, wherein the item data preprocessing module performs filtering on the text information data, connects the filtered text information data to convert it into one string sequence, and uses a pre-trained language model to convert the string sequence into the text feature (see rejection of claim 9 and Wu section 3.2).
As per claim 19, Wu as modified discloses the multi-modality system of claim 18, wherein the neural network model is trained based on item training data by preprocessing the interaction data set and preprocessing interaction training data and item information data (see rejection of claim 1).

As per claim 20, Wu as modified discloses the multi-modality system of claim 18, wherein the item is one of clothes, a movie, music, travel, or a book (sections 6.4 and 7).

Claims 4 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Wu as modified, and further in view of Basu et al., Pub. No. US 20150044659 A1, hereinafter Basu.

As per claim 4, Wu as modified discloses the multi-modality system of claim 1, wherein the interaction data preprocessing module assigns interaction state information to each utterance in the conversation context with the user (see rejection of claim 1, including at least section 3.2 for language/utterance state tracking). Wu as modified does not explicitly disclose, however Basu, in the related field of endeavor of question answering, discloses and clusters system utterances (Basu, pars. 48-59) to divide the system utterances into a plurality of answer sets (Basu, at least pars. 20, 21, 52, 72). Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of the cited references because Basu would have allowed Wu to improve grouping of similar answers into clusters in order to provide richer feedback, discover modalities, and more efficiently/rapidly encode the same or similar answers (Basu, pars. 18-19). Analogous claim 15 is likewise rejected.

Claims 4, 5, 6, 7, 8, 11, and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Wu as modified, and further in view of Katayama et al., Pub. No. US 20220261556, hereinafter Katayama.
As per claim 4, Wu as modified discloses the multi-modality system of claim 1, wherein the interaction data preprocessing module assigns interaction state information to each utterance in the conversation context with the user (see rejection of claim 1, including at least section 3.2 for language/utterance state tracking). Wu as modified does not explicitly disclose, however Katayama, in the related field of endeavor of natural language processing, discloses and clusters system utterances to divide the system utterances into a plurality of answer sets (Katayama, pars. 23, 36, 37, 39-41, disclosing clustering utterances and/or interrogatives as those that are the top N (a first cluster) and those that are not (a second cluster), and storing/using interrogative answers based on rank-based calculated scores (i.e., divide)). Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of the cited references because Katayama would have allowed Wu the following benefit: "Returning a response utterance utilizing the collected information to the user facilitates the user to have a conversation with the interaction system, and hence, smooth interaction can be expected between the interaction system and the user" (Katayama, par. 24). Analogous claim 15 is likewise rejected.

As per claim 5, Wu as modified discloses the multi-modality system of claim 4, wherein the learning module further outputs information on an answer utterance of the system as the result (Katayama, pars. 23, 36, 37, 39-41 disclose outputting utterances as answer results; see also Wu, section 4, and Table 2 of section 6).
As per claim 6, Wu as modified discloses the multi-modality system of claim 5, wherein the information on the answer utterance includes previous interaction state information of a current input sequence, interaction state information of an answer of the system to be currently predicted, and identification information of the answer set (Katayama, pars. 23, 36, 37, 39-41; also, Wu discloses tracking in at least section 3.2).

As per claim 7, Wu as modified discloses the multi-modality system of claim 6, wherein the learning module further includes a decoder for generating an answer sentence based on the identification information of the answer set (Katayama, pars. 36-38; see also Wu section 3.2).

As per claim 8, Wu as modified discloses the multi-modality system of claim 4, wherein the interaction data preprocessing module concatenates similar sentences among the system utterances in the answer set into one sentence (Katayama, at least pars. 37-38 disclose concatenating as adding to the record of the estimated used interrogatives (system answer sentence utterances)).

As per claim 11, Wu as modified discloses the method of claim 7, wherein each item included in the set of recommended items is expressed as a composite modality of a text feature and a non-text feature (Wu, section 4 and Table 2 in section 6.3).

Response to Arguments

Applicant's arguments filed 8 October 2025 have been fully considered but they are not persuasive. With respect to the 35 USC 101 rejection, the remarks present the following:

[Image from applicant's remarks (media_image1.png) — not reproduced]

Examiner is unable to identify which words in the independent claims are directed to any of these features. More specifically, claim 1 includes the term "multi-modality" in the preamble only, and only as a type of label for the claimed system. There are no words in the claim requiring "multi-modal feature extraction and fusion".
The claim does not require text features and image features, much less ones separately computed from different models. The claim does not combine anything, much less text and image features "via learned logit weighting after FCN processing". The claim does not concurrently evaluate anything, much less (a) context-item pairs and (b) item-item pairs within a candidate set, producing a logit-based confidence score for each stream. The claim does not perform set-level scoring, N-best re-ranking, combining of logits, or summation of combined logits across items and pairs to form a set score in order to re-rank an N-best list generated by a trained transformer-based recommender.

The remarks further present the following:

[Image from applicant's remarks (media_image2.png) — not reproduced]

As mentioned above, the independent claims do not recite any such architecture and operations. Although the claims are interpreted in light of the specification, limitations from the specification are not read into the claims. See In re Van Geuns, 988 F.2d 1181, 26 USPQ2d 1057 (Fed. Cir. 1993). Accordingly, Applicant's arguments directed to the 35 USC 101 rejection are not persuasive.

With respect to the prior art rejection, the remarks present:

[Images from applicant's remarks (media_image3.png, media_image4.png, media_image5.png) — not reproduced]

As is evident from these statements, Applicant's interpretation of the independent claims requires reading entire disclosed embodiments into the claims. Although the claims are interpreted in light of the specification, limitations from the specification will not be read into the claims. See In re Van Geuns, 988 F.2d 1181, 26 USPQ2d 1057 (Fed. Cir. 1993).

Conclusion

THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a). A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.
In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to SYED HASAN, whose telephone number is (571) 270-5008. The examiner can normally be reached M-F, 8 am - 5 pm. Examiner interviews are available via telephone, in person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Boris Gorney, can be reached at (571) 270-5626. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).
If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/SYED H HASAN/
Primary Examiner, Art Unit 2154
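For readers less familiar with the Bhaskaran technique the §103 combination relies on, here is a generic illustration of deriving a confidence score from a binary classifier's logit. This is not the applicant's or Bhaskaran's actual model; the sigmoid mapping and all names are assumptions for illustration only.

```python
import math

def confidence_from_logit(logit: float) -> float:
    """Map a binary classifier's raw logit to a (0, 1) confidence score
    via the sigmoid function (one common convention; an assumption here)."""
    return 1.0 / (1.0 + math.exp(-logit))

# A positive logit classifies the (conversation context, item) pair as
# "true" (item matches context) with confidence above 0.5; a negative
# logit classifies it as "false".
print(confidence_from_logit(2.0))   # well above 0.5
print(confidence_from_logit(-1.0))  # below 0.5
```

In a system of the kind the claims describe, such scores could be computed for each context-item pair in a recommended set and compared against a threshold, but the claim language at issue recites only the classifier and the logit-based score.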

Prosecution Timeline

Nov 13, 2023
Application Filed
Jul 03, 2025
Non-Final Rejection — §101, §103
Oct 08, 2025
Response Filed
Dec 15, 2025
Final Rejection — §101, §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12602423 — REAL-TIME NORMALIZATION OF RAW ENTERPRISE DATA FROM DISPARATE SOURCES (2y 5m to grant; granted Apr 14, 2026)
Patent 12591662 — SECURITY MARKER INJECTION FOR LARGE LANGUAGE MODELS (2y 5m to grant; granted Mar 31, 2026)
Patent 12566589 — SYSTEM AND METHOD FOR DETERMINING DATA FEED SOURCES FOR INTERACTIVE AUTOMATED CODE GENERATION AND MODIFICATION (2y 5m to grant; granted Mar 03, 2026)
Patent 12561352 — OPTIMIZING PUBLICATION AND SUBSCRIPTION EXPRESSIVENESS (2y 5m to grant; granted Feb 24, 2026)
Patent 12554759 — RECOMMENDATION GENERATION USING USER INPUT (2y 5m to grant; granted Feb 17, 2026)
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 3-4
Grant Probability: 82%
With Interview: 97% (+15.5%)
Median Time to Grant: 3y 2m
PTA Risk: Moderate
Based on 732 resolved cases by this examiner. Grant probability derived from career allow rate.
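The "With Interview" figure follows from the two numbers above it. A sketch under the assumption that the interview lift is simply added to the baseline, which is consistent with the displayed values:

```python
# Combine the baseline grant probability with the examiner's interview lift.
base_grant_probability = 0.82  # career allow rate (displayed as 82%)
interview_lift = 0.155         # +15.5% lift for resolved cases with interview

with_interview = base_grant_probability + interview_lift  # ~0.975
print(with_interview)  # the dashboard displays this as 97%
```

How the dashboard rounds 97.5% down to 97% is not stated; it may start from the unrounded 81.6% allow rate (81.6 + 15.5 = 97.1).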
