Prosecution Insights
Last updated: April 19, 2026
Application No. 18/148,735

SYSTEMS AND METHODS FOR SEQUENTIAL RECOMMENDATION WITH CASCADE-GUIDED ADVERSARIAL TRAINING

Status: Non-Final OA (§103)
Filed: Dec 30, 2022
Examiner: KASSIM, IMAD MUTEE
Art Unit: 2129
Tech Center: 2100 — Computer Architecture & Software
Assignee: Salesforce Inc.
OA Round: 1 (Non-Final)
Grant Probability: 72% (Favorable)
Expected OA Rounds: 1-2
Time to Grant: 3y 8m
With Interview: 99%

Examiner Intelligence

Career Allow Rate: 72% (above average; 116 granted / 160 resolved; +17.5% vs TC avg)
Interview Lift: +33.8% (strong; allow rate of resolved cases with vs. without interview)
Typical Timeline: 3y 8m avg prosecution; 23 currently pending
Career History: 183 total applications across all art units

Statute-Specific Performance

§101: 22.6% (-17.4% vs TC avg)
§103: 44.2% (+4.2% vs TC avg)
§102: 11.8% (-28.2% vs TC avg)
§112: 12.9% (-27.1% vs TC avg)
Tech Center averages are estimates. Based on career data from 160 resolved cases.

Office Action

§103
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action: A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-2, 4-10, 12-18 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over He et al. ("Adversarial Personalized Ranking for Recommendation", published 27 June 2018, pp. 355-364) in view of Kang et al. ("Self-Attentive Sequential Recommendation", 2018 IEEE).

Regarding claim 1. He teaches a method for training a sequential recommendation model, the method comprising: (see page 363, "Due to the abundance of user feedback such as ratings and purchases that can directly reflect a user's preference, research on item recommendation have mainly focused on mining the feedback data, known as collaborative filtering (CF). Among the various CF methods, matrix factorization (MF) [18], a special type of latent factor models, is a basic yet most effective recommender model.");

generating, via the sequential recommendation model, a recommendation score relating to a target item for the user (see page 356, "Formally, let u denote a user and i denote an item, then the predictive model of MF is formulated as: ŷ_ui = p_uᵀ q_i, where p_u ∈ R^K and q_i ∈ R^K denote the embedding vector for user u and item i, respectively, and K is the size of embedding vector also called as embedding size. Θ denotes the model parameters of MF, which is consisted of all user embedding and item embedding vectors, i.e., Θ = {{p_u}_{u∈U}, {q_i}_{i∈I}}, where U and I denote the set of all users and items, respectively. We use P and Q to denote the embedding matrix P = {p_u}_{u∈U}, Q = {q_i}_{i∈I} for short.", i.e., p_u == user embedding, ŷ_ui == recommendation score relating to a target item for the user);

computing a first loss based on the generated recommendation score (see page 356, "Bayesian Personalized Ranking (BPR) maximizes the margin between an observed interaction and its unobserved counterparts. This is fundamentally different from pointwise methods [2, 17] that optimize each model prediction towards a predefined ground truth. Formally, the objective function (to be minimized) of BPR is L_BPR = Σ_{(u,i,j)∈D} −ln σ(ŷ_ui(Θ) − ŷ_uj(Θ)) + λ_Θ ‖Θ‖² … It is worth noting that the behavior of BPR can be interpreted as a classifier — given a triplet (u, i, j), it determines whether (u, i) should have a higher score than (u, j).");

perturbing embeddings of the items based on the respective cascade scores (see page 359, "The solution is simple and straightforward — we first train MF with BPR, and then further optimize it under our APR framework. We term the method as Adversarial Matrix Factorization (AMF). Figure 2 illustrates our AMF method. Since the parameters of MF are embedding vectors for users and items, we apply adversarial perturbations on the embedding vector. Given a (u, i) pair, the predictive model with perturbations is defined as ŷ_ui(Θ + Δ) = (p_u + Δ_u)ᵀ (q_i + Δ_i)"; also see Figure 2: Illustration of our AMF method. The perturbations Δ are enforced on each embedding vector of user and item; also see page 357, section 3.1);

computing a second loss based on the perturbed embeddings of the items (see page 357, "Due to the rationality of the BPR pairwise objective in personalized ranking, we choose it as the building block. To enhance the robustness, we enforce the model to perform well even when the adversarial perturbations (defined in Equation (2)) are presented. To achieve this, we additionally optimize the model to minimize the BPR objective function with the perturbed parameters. Formally, we define the objective function of adversarial personalized ranking as follows: L_APR(D|Θ) = L_BPR(D|Θ) + λ L_BPR(D|Θ + Δ_adv), where Δ_adv = arg max_{Δ, ‖Δ‖ ≤ ϵ} L_BPR(D|Θ + Δ)");

and updating parameters of the sequential recommendation model based on the first loss and the second loss via backpropagation (see page 358, "We can see that, similar to BPR, our formulation of APR leads to a general learning framework which is model independent. As long as the underlying model ŷ_ui(Θ) is differentiable, it can be learned under our APR framework using backpropagation and gradient based optimization algorithms. There are two hyper-parameters — ϵ and λ — to be specified in APR in addition to the ones in BPR. In what follows, we propose a generic solution for APR based on SGD… [Algorithm 1: SGD-based APR learning procedure]").
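The APR mechanics the examiner cites from He (a clean BPR loss plus a second BPR loss under a worst-case, norm-bounded perturbation of the embeddings) can be sketched in a few lines. This is a minimal NumPy illustration of the idea, not He's implementation: the perturbation uses a fast-gradient approximation applied, for brevity, only to the positive-item embedding, and all function names are my own.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def bpr_loss(p_u, q_i, q_j):
    """BPR pairwise loss for one (user, positive item, negative item)
    triplet: -ln sigma(y_ui - y_uj), with y_ui = p_u . q_i."""
    return -np.log(sigmoid(p_u @ q_i - p_u @ q_j))

def adversarial_perturbation(p_u, q_i, q_j, eps=0.5):
    """Fast-gradient approximation of the worst-case perturbation of the
    positive-item embedding: a step of norm eps along the loss gradient."""
    margin = p_u @ q_i - p_u @ q_j
    grad_qi = -(1.0 - sigmoid(margin)) * p_u  # d(-ln sigma(m)) / d q_i
    return eps * grad_qi / (np.linalg.norm(grad_qi) + 1e-12)

def apr_objective(p_u, q_i, q_j, eps=0.5, lam=1.0):
    """APR-style objective: clean BPR loss plus lambda times the BPR loss
    evaluated with the adversarially perturbed item embedding."""
    delta = adversarial_perturbation(p_u, q_i, q_j, eps)
    return bpr_loss(p_u, q_i, q_j) + lam * bpr_loss(p_u, q_i + delta, q_j)

rng = np.random.default_rng(0)
p_u, q_i, q_j = rng.normal(size=(3, 8))
clean = bpr_loss(p_u, q_i, q_j)
robust = apr_objective(p_u, q_i, q_j)
```

Because the perturbation follows the loss gradient, the perturbed term can only raise the objective, which is what makes minimizing it a robustness constraint rather than ordinary regularization.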
He does not specifically teach receiving, via a data interface, a training dataset comprising a plurality of user interaction sequences for a user; and computing, for at least one of the plurality of user interaction sequences, respective cascade scores for items in the at least one of the plurality of user interaction sequences based on a sequence position and the presence of the items in other sequences of the plurality of user interaction sequences.

Kang teaches receiving, via a data interface, a training dataset comprising a plurality of user interaction sequences for a user (see page 198, "Sequential recommendation (or next-item recommendation) differs slightly from this setting, as it only considers the order of actions, and models sequential patterns which are independent of time. Essentially, sequential models try to model the 'context' of users' actions based on their recent activities, rather than considering temporal patterns per se."; also see page 199, "In the setting of sequential recommendation, we are given a user's action sequence S^u = (S^u_1, S^u_2, . . . , S^u_{|S^u|}), and seek to predict the next item. During the training process, at time step t, the model predicts the next item depending on the previous t items. As shown in Figure 1, it will be convenient to think of the model's input as (S^u_1, S^u_2, . . . , S^u_{|S^u|−1}) and its expected output as a 'shifted' version of the same sequence: (S^u_2, S^u_3, . . . , S^u_{|S^u|})."); and computing, for at least one of the plurality of user interaction sequences, respective cascade scores for items in the at least one of the plurality of user interaction sequences based on a sequence position and the presence of the items in other sequences of the plurality of user interaction sequences (see page 199, "Positional Embedding: As we will see in the next section, since the self-attention model doesn't include any recurrent or convolutional module, it is not aware of the positions of previous items. Hence we inject a learnable position embedding P ∈ R^{n×d} into the input embedding…"; also see page 201, "Factorized Item Similarity Models [8]: FISM estimates a preference score toward item i by considering the similarities between i and items the user consumed before… If we use one self-attention layer (excluding the feedforward network), set uniform attention weights (i.e., 1/|S^u|) on items, use unshared item embeddings, and remove the position embedding, SASRec is reduced to FISM. Thus our model can be viewed as an adaptive, hierarchical, sequential item similarity model for next item recommendation.").

Both He and Kang pertain to the problem of item recommendation systems and are thus analogous art. It would have been obvious to one skilled in the art before the effective filing date of the claimed invention to combine He and Kang to teach the above limitations. The motivation for doing so would be "To capture such patterns, two approaches have proliferated: Markov Chains (MCs) and Recurrent Neural Networks (RNNs)… At each time step, SASRec seeks to identify which items are 'relevant' from a user's action history, and use them to predict the next item. Extensive empirical studies show that our method outperforms various state-of-the-art sequential models (including MC/CNN/RNN-based approaches) on both sparse and dense datasets. Moreover, the model is an order of magnitude more efficient than comparable CNN/RNN-based models. Visualizations on attention weights also show how our model adaptively handles datasets with various density, and uncovers meaningful patterns in activity sequences." (see Kang abstract).

Regarding claim 2. He and Kang teach the method of claim 1. Kang further teaches wherein the respective cascade scores are computed by assigning a higher score to items in earlier positions in their respective user interaction sequences (see page 200, "a high interaction score r_{i,t} means a high relevance, and we can generate recommendations by ranking the scores."; also see the page 201 FISM passage quoted above). The motivation utilized in the combination of claim 1, supra, applies equally to claim 2.

Regarding claim 4. He and Kang teach the method of claim 1. He further teaches the computing of the second loss further comprising: generating a first user embedding based on embeddings of the items; generating a second user embedding based on the perturbed embeddings of the items; and computing the second loss by computing a difference between the first user embedding and the second user embedding (see page 357, the APR objective passage quoted in the rejection of claim 1 above).

Regarding claim 5. He and Kang teach the method of claim 1. He further teaches wherein the perturbing of the embeddings of the items includes: determining a worst-case perturbation direction based on a gradient of a function defining the second loss, and perturbing the embeddings of the items based on the determined worst-case perturbation direction (see page 358, the backpropagation and SGD passage quoted in the rejection of claim 1 above).

Regarding claim 6. He and Kang teach the method of claim 1. He further teaches further comprising: generating a second recommendation score using at least one of: a perturbed user embedding; a perturbed target item embedding; or a perturbed negative item embedding; and computing a third loss based on the second recommendation score, wherein the updating of parameters is further based on the third loss (see page 357, the APR objective passage quoted in the rejection of claim 1 above, i.e., multiple losses are computed and the model is evaluated with perturbed parameters).

Regarding claim 7. He and Kang teach the method of claim 6. He further teaches wherein at least one of the perturbed user embedding, perturbed target item embedding, or perturbed negative item embedding is perturbed by determining a worst-case perturbation direction based on one or more gradients of a function defining the third loss (see page 357, the APR objective passage quoted in the rejection of claim 1 above).

Regarding claim 8. He and Kang teach the method of claim 6. He further teaches wherein updating the parameters is based on a combined loss objective including a weighted sum of the first loss, the second loss, and the third loss (see page 357, the APR objective passage quoted in the rejection of claim 1 above).

Claims 9-10 and 12-16 recite a system to perform the method recited in claims 1-2 and 4-8. Therefore the rejection of claims 1-2 and 4-8 above applies equally here.
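Two of the dependent-claim limitations above are concrete enough to sketch. Claim 4's second loss compares a user embedding built from clean item embeddings against one built from perturbed item embeddings; claim 8 combines three losses as a weighted sum. The snippet below is a toy illustration under my own assumptions (mean-pooling as a stand-in encoder, arbitrary weights), not the application's actual model:

```python
import numpy as np

def user_embedding(item_embs):
    """Toy user representation: mean-pool the embeddings of the items in
    the user's interaction sequence (a stand-in for the real encoder)."""
    return np.mean(item_embs, axis=0)

def embedding_difference_loss(item_embs, perturbed_item_embs):
    """Claim-4 style second loss: squared distance between the user
    embedding from clean item embeddings and the one from perturbed
    item embeddings."""
    diff = user_embedding(item_embs) - user_embedding(perturbed_item_embs)
    return float(np.sum(diff ** 2))

def combined_objective(l_rec, l_cascade, l_adv, weights=(1.0, 0.5, 0.5)):
    """Claim-8 style objective: weighted sum of the three losses."""
    w1, w2, w3 = weights
    return w1 * l_rec + w2 * l_cascade + w3 * l_adv

items = np.ones((3, 4))                       # 3 items, 4-dim embeddings
second_loss = embedding_difference_loss(items, items + 0.1)
total = combined_objective(1.0, second_loss, 0.25)
```

The loss weights would in practice be hyperparameters tuned on validation data; nothing in the excerpt fixes their values.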
Kang also teaches the additional elements of claim 9 not recited in claim 1, comprising a memory that stores the sequential recommendation model and a plurality of processor-executable instructions (see Kang, page 197: Figure 1 shows a computer etc. and a simplified diagram showing the training process of SASRec; at each time step, the model considers all previous items, and uses attention to 'focus on' items relevant to the next action).

Claims 17-18 and 20 recite a non-transitory machine-readable medium to perform the method recited in claims 1-2 and 4. Therefore the rejection of claims 1-2 and 4 above applies equally here. Kang also teaches the additional elements of claim 17 not recited in claim 1, comprising a plurality of machine-executable instructions which, when executed by one or more processors (see Kang, page 197, Figure 1, as cited for claim 9 above).

Allowable Subject Matter

Claims 3, 11 and 19 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

Related prior art: Vitaladevuni et al. (US 9443198 B1) teaches that a conventional approach to detection is to process a sample through a cascade classifier, also referred to herein as a detection cascade or simply as a cascade, that includes a sequence of classifiers that score the sample on how likely it is to contain the detection target; at each stage of the cascade, a decision is made to either discard the sample under consideration or pass it on to the next stage. Ma et al. (US 20230281448 A1) teaches fusing first recommendation scores of the plurality of recommendation dimensions based on the mapping feature to obtain a fusion feature, performing recommendation score prediction processing on the to-be-recommended information based on the fusion feature to obtain a second recommendation score of the target object for the to-be-recommended information, and executing a recommendation operation of the to-be-recommended information corresponding to the target object based on the second recommendation score.

Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to IMAD M KASSIM whose telephone number is (571) 272-2958. The examiner can normally be reached 10:30 AM-5:30 PM, M-F (E.S.T.). Examiner interviews are available via telephone, in person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Michael J. Huntley, can be reached at (303) 297-4307. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/IMAD KASSIM/
Primary Examiner, Art Unit 2129
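The "cascade score" that distinguishes the application (a per-item score derived from sequence position plus the item's presence in other training sequences, with earlier positions scoring higher per claim 2) is not spelled out in the excerpt above. Purely as a hypothetical sketch, one score with those two claimed properties could combine a geometric position decay with a cross-sequence frequency factor; the decay constant and the multiplicative weighting here are my own assumptions, not the application's formula:

```python
from collections import Counter

def cascade_scores(sequences, decay=0.8):
    """Hypothetical cascade score: items earlier in a sequence get a
    larger geometric position weight, scaled up when the item also
    appears in other sequences of the training set."""
    # How many sequences each item appears in (set() so repeats in one
    # sequence count once).
    counts = Counter(item for seq in sequences for item in set(seq))
    scored = []
    for seq in sequences:
        scores = {}
        for pos, item in enumerate(seq):
            position_weight = decay ** pos   # position 0 weighs most
            other_seqs = counts[item] - 1    # presence in *other* sequences
            scores[item] = position_weight * (1 + other_seqs)
        scored.append(scores)
    return scored

seqs = [["a", "b", "c"], ["b", "a"], ["c", "a", "b"]]
per_seq = cascade_scores(seqs)
```

Any monotone decreasing position weight and any monotone increasing cross-sequence factor would satisfy the same claim language; the geometric/linear choice above is only for concreteness.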

Prosecution Timeline

Dec 30, 2022
Application Filed
Mar 19, 2026
Non-Final Rejection — §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12596923: MACHINE LEARNING OF KEYWORDS (granted Apr 07, 2026; 2y 5m to grant)
Patent 12572843: AGENT SYSTEM FOR CONTENT RECOMMENDATIONS (granted Mar 10, 2026; 2y 5m to grant)
Patent 12572854: ROOT CAUSE DISCOVERY ENGINE (granted Mar 10, 2026; 2y 5m to grant)
Patent 12566980: SYSTEM AND METHOD HAVING THE ARTIFICIAL INTELLIGENCE (AI) ALGORITHM OF K-NEAREST NEIGHBORS (K-NN) (granted Mar 03, 2026; 2y 5m to grant)
Patent 12566861: IDENTIFYING AND CORRECTING VULNERABILITIES IN MACHINE LEARNING MODELS (granted Mar 03, 2026; 2y 5m to grant)
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: 72% (99% with interview, +33.8% lift)
Median Time to Grant: 3y 8m
PTA Risk: Low
Based on 160 resolved cases by this examiner. Grant probability derived from career allow rate.
