Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Drawings
The drawings were received on 07/13/2023. These drawings are acceptable.
Specification
The disclosure is objected to because of the following informalities: page 1, column 1, line 3, “Business” should be plural; page 1, column 2, line 55, “entering and/or existing” should read “entering and/or exiting.”
Appropriate correction is required.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-3, 6, 7, 9, and 15 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1: Independent claims 1, 9, and 15 are directed to a method, a manufacture, and a system, respectively. Therefore, these claims, as well as their dependent claims, are directed to one of the four statutory categories (process, machine (i.e., system), manufacture, or composition of matter).
With respect to claim 1:
Step 2A Prong 1:
“Simulating, …, customer interaction with the input journey” (mental process, simulating customer interaction can be performed in the human mind, or by a human using a pen and paper – see MPEP 2106.04(a)(2)(III)).
Step 2A Prong 2: The additional elements recited in the claim do not integrate the judicial exception into a practical application.
Additional elements:
“obtaining, by a model training engine, training data from a plurality of journeys, each journey comprising a sequence of events, wherein the training data from each journey indicates customer interaction with each event in the sequence of events of the journey;” (adding insignificant extra-solution activity to the judicial exception – mere data gathering, see MPEP 2106.05(g));
“training, by the model training engine using the training data, a machine learning model to simulate customer interaction with an input journey comprising an input sequence of events;” (adding insignificant extra-solution activity to the judicial exception – see Recentive v. Fox);
“, by the trained machine learning model,” (mere instructions to apply the exception using a generic computer component – see MPEP 2106.05(f));
“causing, by a user interface component, display of results of the simulated customer interaction with the input journey;” (mere instructions to apply the exception using a generic computer component).
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.
Additional elements:
“obtaining, by a model training engine, training data from a plurality of journeys, each journey comprising a sequence of events, wherein the training data from each journey indicates customer interaction with each event in the sequence of events of the journey;” (MPEP 2106.05(d)(II)(iv) indicates that merely storing and retrieving information in memory is a well-understood, routine, and conventional function when it is claimed in a merely generic manner (as it is in the present claim));
“training, by the model training engine using the training data, a machine learning model to simulate customer interaction with an input journey comprising an input sequence of events;” (Recentive Analytics, Inc. v. Fox Corp. indicates that training a machine learning model is a well-understood, routine, and conventional function (pages 8-9) when it is claimed in a generic manner (as it is in the present claim));
“, by the trained machine learning model,” (mere instructions to apply the exception using a generic computer component – see MPEP 2106.05(f));
“causing, by a user interface component, display of results of the simulated customer interaction with the input journey;” (mere instructions to apply the exception using a generic computer component – see MPEP 2106.05(f)).
With respect to claim 2:
Step 2A Prong 1:
“The computer-implemented method of claim 1, wherein the results of the simulated customer interaction with the input journey predicts a probability of customers triggering one or more events of the input sequence of events of the input journey.” (mental process, predicting a probability can be performed in the human mind, or by a human using a pen and paper – see MPEP 2106.04(a)(2)(III)).
Step 2A Prong 2: The additional elements recited in the claim do not integrate the judicial exception into a practical application as there are no additional elements.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception as there are no additional elements.
With respect to claim 3:
Step 2A Prong 1:
“The computer-implemented method of claim 1, wherein the results of the simulated customer interaction with the input journey predicts one or more predicted events based on the input journey.” (mental process, predicting an event can be performed in the human mind, or by a human using a pen and paper – see MPEP 2106.04(a)(2)(III)).
Step 2A Prong 2: The additional elements recited in the claim do not integrate the judicial exception into a practical application as there are no additional elements.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception as there are no additional elements.
With respect to claim 6:
Step 2A Prong 1: This claim inherits the mental process of claim 1, from which it depends.
Step 2A Prong 2: The additional elements recited in the claim do not integrate the judicial exception into a practical application.
Additional elements:
“obtaining customer data for each customer of the plurality of journeys of the training data, wherein the customer data comprises demographic data” (adding insignificant extra-solution activity to the judicial exception – mere data gathering, see MPEP 2106.05(g));
“further training the machine learning model to simulate customer interaction with the input journey using the customer data.” (adding insignificant extra-solution activity to the judicial exception – see Recentive v. Fox).
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.
Additional elements:
“obtaining customer data for each customer of the plurality of journeys of the training data, wherein the customer data comprises demographic data” (MPEP 2106.05(d)(II)(iv) indicates that merely storing and retrieving information in memory is a well-understood, routine, and conventional function when it is claimed in a merely generic manner (as it is in the present claim));
“further training the machine learning model to simulate customer interaction with the input journey using the customer data.” (Recentive Analytics, Inc. v. Fox Corp. indicates that training a machine learning model is a well-understood, routine, and conventional function (pages 8-9) when it is claimed in a generic manner (as it is in the present claim)).
With respect to claim 7:
Step 2A Prong 1: This claim inherits the mental process of claim 1, from which it depends.
Step 2A Prong 2: The additional elements recited in the claim do not integrate the judicial exception into a practical application.
Additional elements:
“the training data further comprises data indicating a time of the customer interactions with each event in the sequence of events of each journey of the plurality of journeys” (adding insignificant extra-solution activity to the judicial exception – mere data gathering, see MPEP 2106.05(g));
“further training the machine learning model to simulate customer interaction with the input journey using the time of the customer interactions.” (adding insignificant extra-solution activity to the judicial exception – see Recentive v. Fox).
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.
Additional elements:
“the training data further comprises data indicating a time of the customer interactions with each event in the sequence of events of each journey of the plurality of journeys” (MPEP 2106.05(d)(II)(iv) indicates that merely storing and retrieving information in memory is a well-understood, routine, and conventional function when it is claimed in a merely generic manner (as it is in the present claim));
“further training the machine learning model to simulate customer interaction with the input journey using the time of the customer interactions.” (Recentive Analytics, Inc. v. Fox Corp. indicates that training a machine learning model is a well-understood, routine, and conventional function (pages 8-9) when it is claimed in a generic manner (as it is in the present claim)).
With respect to claim 9:
Step 2A Prong 1:
“predicting, by a machine learning model, a predicted event based on the sequence of events of the journey;” (mental process, predicting an event can be performed in the human mind, or by a human using a pen and paper – see MPEP 2106.04(a)(2)(III))
Step 2A Prong 2: The additional elements recited in the claim do not integrate the judicial exception into a practical application.
Additional elements:
“A non-transitory computer-readable medium storing executable instructions, which when executed by a processing device, cause the processing device to perform operations” (mere instructions to apply the exception using a generic computer component – see MPEP 2106.05(f));
“obtaining a journey, the journey comprising a sequence of events;” (adding insignificant extra-solution activity to the judicial exception – mere data gathering, see MPEP 2106.05(g));
“, by a machine learning model,” (mere instructions to apply the exception using a generic computer component – see MPEP 2106.05(f));
“causing display of the predicted event;” (mere instructions to apply the exception using a generic computer component – see MPEP 2106.05(f)).
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.
Additional elements:
“A non-transitory computer-readable medium storing executable instructions, which when executed by a processing device, cause the processing device to perform operations” (mere instructions to apply the exception using a generic computer component – see MPEP 2106.05(f));
“obtaining a journey, the journey comprising a sequence of events;” (MPEP 2106.05(d)(II)(iv) indicates that merely storing and retrieving information in memory is a well-understood, routine, and conventional function when it is claimed in a merely generic manner (as it is in the present claim));
“, by a machine learning model,” (mere instructions to apply the exception using a generic computer component – see MPEP 2106.05(f));
“causing display of the predicted event;” (mere instructions to apply the exception using a generic computer component – see MPEP 2106.05(f)).
With respect to claim 15:
Step 2A Prong 1:
“predicting, by the machine learning model, a probability of customers triggering one or more events of the sequence of events of the journey;” (mental process, predicting a probability of customers triggering one or more events can be performed in the human mind, or by a human using a pen and paper – see MPEP 2106.04(a)(2)(III))
Step 2A Prong 2: The additional elements recited in the claim do not integrate the judicial exception into a practical application.
Additional elements:
“a processor; and a non-transitory computer-readable medium having stored thereon instructions that when executed by the processor, cause the processor to perform operations” (mere instructions to apply the exception using a generic computer component – see MPEP 2106.05(f));
“obtaining, by a machine learning model, a journey, the journey comprising a sequence of events” (adding insignificant extra-solution activity to the judicial exception – mere data gathering, see MPEP 2106.05(g));
“, by a machine learning model,” (mere instructions to apply the exception using a generic computer component – see MPEP 2106.05(f));
“causing, by the user interface component, display of the probability of customers triggering one or more events of the sequence of events of the journey.” (mere instructions to apply the exception using a generic computer component – see MPEP 2106.05(f)).
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.
Additional elements:
“a processor; and a non-transitory computer-readable medium having stored thereon instructions that when executed by the processor, cause the processor to perform operations” (mere instructions to apply the exception using a generic computer component – see MPEP 2106.05(f));
“obtaining, by a machine learning model, a journey, the journey comprising a sequence of events;” (MPEP 2106.05(d)(II)(iv) indicates that merely storing and retrieving information in memory is a well-understood, routine, and conventional function when it is claimed in a merely generic manner (as it is in the present claim));
“, by a machine learning model,” (mere instructions to apply the exception using a generic computer component – see MPEP 2106.05(f));
“causing, by the user interface component, display of the probability of customers triggering one or more events of the sequence of events of the journey.” (mere instructions to apply the exception using a generic computer component – see MPEP 2106.05(f)).
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.
Claims 1-4, 6-7, 9-10, and 15-16 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Zhao et al. (US-20240169396-A1), hereafter Zhao.
Regarding claim 1, Zhao teaches:
A computer-implemented method comprising:
obtaining, by a model training engine, training data from a plurality of journeys, each journey comprising a sequence of events, wherein the training data from each journey indicates customer interaction with each event in the sequence of events of the journey (Paragraph [0051] “Information from the database 203 is supplied to a feature engineering function 204, which determines the user behaviors of interest relating to consumed linear entertainment programs… The output from the feature engineering function 204 is supplied to a periodic data aggregation function 205, which aggregates the user behaviors on a periodic (such as daily) basis…Aggregated data regarding the user's consumption of linear entertainment programming can be forwarded from the periodic data aggregation function 205 to a data extract, transform, and load (ETL) function 206, which operates to provide the aggregated user behavior information populating a behavior feature database 207. The behavior feature database 207 supplies user behavior data relating to linear entertainment programs to a training data generation function 208 as necessary for at least a determination of a Fibonacci confidence interval level of each user's interest in viewed programs and a behavior shift feature for each user relating to viewed programs.” The aggregated user behavior data is training data from a plurality of journeys; supplying that data from the database corresponds to the obtaining step; each journey (a set of aggregated user behavior data) comprises a sequence of user behaviors of interest relating to consumed linear entertainment programs, which are a type of event; and the users’ interest indicates customer interaction);
training, by the model training engine using the training data, a machine learning model to simulate customer interaction with an input journey comprising an input sequence of events; (Paragraph [0051-0052] “The behavior feature database 207 provides user behavior input including the Fibonacci confidence interval levels and behavior shift features to both a model training function 209 and a training data registration function 210. The model training function 209 trains a machine learning model and provides model data to a behavior model registration function 211 … An inference data generation function 214 is based on inputs regarding those features from the target feature engineering function 212, as well as the machine learning model defined by the behavior model registration function 211, and user behavior information from the training data registration function 210, and generates an inference 215.” Simulating is considered to include generating an inference so a model that simulates customer interaction with an input journey comprising an input sequence of events includes an inference data generation function)
simulating, by the trained machine learning model, customer interaction with the input journey; (Paragraph [0052] “…and generates an inference.” As above, simulating is considered to include generating an inference)
and causing, by a user interface component, display of results of the simulated customer interaction with the input journey; (Paragraph [0036] “According to embodiments of this disclosure, an electronic device 101 is included in the network configuration 100. The electronic device 101 can include at least one of a bus 110, a processor 120, a memory 130, an input/output (I/O) interface 150, a display 160, a communication interface 170, or a sensor 180.” A display is a user interface component and can cause display of results.)
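For illustration only, the train-then-simulate pipeline mapped above can be summarized in a minimal sketch; the implementation, the names, and the frequency-based “model” are hypothetical and are not drawn from the claims or from Zhao:

    # Hypothetical sketch of the claimed pipeline: obtain training data from
    # journeys, train a model, simulate interaction with an input journey,
    # and display the results. Illustrative only.
    from collections import defaultdict

    def train(journeys):
        # journeys: list of [(event, interacted), ...] sequences
        counts = defaultdict(lambda: [0, 0])  # event -> [interactions, occurrences]
        for journey in journeys:
            for event, interacted in journey:
                counts[event][0] += int(interacted)
                counts[event][1] += 1
        # "model": per-event historical interaction rate
        return {e: hits / total for e, (hits, total) in counts.items()}

    def simulate(model, input_events):
        # per-event simulated interaction probability for the input sequence
        return [(e, model.get(e, 0.0)) for e in input_events]

    model = train([[("email_open", True), ("click", False)],
                   [("email_open", False), ("click", True)]])
    for event, p in simulate(model, ["email_open", "click"]):
        print(f"{event}: simulated interaction probability {p:.2f}")  # display of results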
Regarding claim 2, Zhao teaches:
All of the material disclosed in claim 1, and additionally teaches
the results of the simulated customer interaction with the input journey predicts a probability of customers triggering one or more events of the input sequence of events of the input journey. (Paragraph [0082] “A sigmoid function 720 operates on outputs of the FC layers 702 and produces at least one of a label 721 (during training) or a tune-in probability score 722 (during inferencing)” The tune-in probability score is considered a probability of customers triggering one or more events).
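For illustration only, the mapped sigmoid-to-probability feature can be sketched as follows; the logit value is hypothetical and the snippet is not drawn from Zhao:

    # Hypothetical sketch: a sigmoid maps a model's raw output (a logit) to a
    # probability score, analogous to Zhao's tune-in probability score 722.
    import math

    def sigmoid(logit):
        return 1.0 / (1.0 + math.exp(-logit))

    print(round(sigmoid(1.2), 2))  # 0.77: probability of triggering the event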
Regarding claim 3, Zhao teaches:
All of the material disclosed in claim 1, and additionally teaches the results of the simulated customer interaction with the input journey predicts one or more predicted events based on the input journey. (Paragraph [0058] “For tune-in prediction, there may be a need or desire to both (i) predict whether a user has any interest in a campaign program, and (ii) predict the user's interest level in the target program for the campaign.” Predicting an event includes predicting whether a user has any interest in a campaign).
Regarding claim 4, Zhao teaches:
All of the material disclosed in claim 1, and additionally teaches
the machine learning model is a transformer-based machine learning model (Paragraph [0034] “A deep learning model structure may use a transformer structure and attention to map user data from the behavior shift feature space into a Fibonacci confidence interval level prediction.” A deep learning model is a machine learning model. Using a transformer structure means it is a transformer-based machine learning model).
and the customer interaction with the input journey is simulated based on positional embeddings determined for each event of the input sequence of events. (Paragraph [0069] “The data on which the transformer layers 309a-309d operate is influenced by a positional embedding 312.” The transformer layer is used in the machine learning model that is simulating the customer interaction with the input journey. Thus, the simulation is based on positional embeddings.)
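For illustration only, positional embeddings of the kind mapped above can be sketched with the common sinusoidal scheme; that particular scheme is an assumption, since Zhao does not specify how positional embedding 312 is computed:

    # Hypothetical sketch: each event embedding in the input sequence is summed
    # with a position-dependent vector before the transformer layers operate on it.
    import math

    def positional_embedding(pos, dim):
        # standard sinusoidal position encoding (an assumption, not Zhao's scheme)
        return [math.sin(pos / 10000 ** (i / dim)) if i % 2 == 0
                else math.cos(pos / 10000 ** ((i - 1) / dim))
                for i in range(dim)]

    def add_positions(event_embeddings):
        dim = len(event_embeddings[0])
        return [[x + p for x, p in zip(emb, positional_embedding(pos, dim))]
                for pos, emb in enumerate(event_embeddings)]

    events = [[0.1, 0.2, 0.3, 0.4], [0.5, 0.6, 0.7, 0.8]]  # toy 4-dimensional event embeddings
    print(add_positions(events))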
Regarding claim 6, Zhao teaches:
All of the material disclosed in claim 1, and additionally teaches
obtaining customer data for each customer of the plurality of journeys of the training data, wherein the customer data comprises demographic data;
and further training the machine learning model to simulate customer interaction with the input journey using the customer data. (Paragraph [0071] “In some cases, the feature data used in the present disclosure includes the following information” and diagram below paragraph [0071]. Geographic data is considered a form of demographic data, and, as discussed above with respect to Zhao paragraphs [0051]-[0052], the feature data is used as part of the training to simulate customer interaction)
Regarding claim 7, Zhao teaches:
All of the material disclosed in claim 1, and additionally teaches
the training data further comprises data indicating a time of the customer interactions with each event in the sequence of events of each journey of the plurality of journeys;
and further training the machine learning model to simulate customer interaction with the input journey using the time of the customer interactions. (Paragraph [0071] “In some cases, the feature data used in the present disclosure includes the following information” and diagram below paragraph [0071]. Time of day indicates the time of customer interactions. As above, Zhao paragraphs [0051]-[0052] state that feature data is used as part of the training to simulate customer interaction)
Regarding claim 9, Zhao teaches:
A non-transitory computer-readable medium storing executable instructions, which when executed by a processing device, cause the processing device to perform operations comprising: (Paragraph [0010] “Moreover, various functions described below can be implemented or supported by one or more computer programs, each of which is formed from computer readable program code and embodied in a computer readable medium.”)
obtaining a journey, the journey comprising a sequence of events; (Paragraph [0051] “Information from the database 203 is supplied to a feature engineering function 204, which determines the user behaviors of interest relating to consumed linear entertainment programs… The output from the feature engineering function 204 is supplied to a periodic data aggregation function 205, which aggregates the user behaviors on a periodic (such as daily) basis…Aggregated data regarding the user's consumption of linear entertainment programming can be forwarded from the periodic data aggregation function 205 to a data extract, transform, and load (ETL) function 206, which operates to provide the aggregated user behavior information populating a behavior feature database 207. The behavior feature database 207 supplies user behavior data relating to linear entertainment programs to a training data generation function 208 as necessary for at least a determination of a Fibonacci confidence interval level of each user's interest in viewed programs and a behavior shift feature for each user relating to viewed programs.” A day of aggregated user behavior data is a journey; supplying that data from the database corresponds to the obtaining step; and the journey (a day of aggregated user behavior data) comprises a sequence of user behaviors of interest relating to consumed linear entertainment programs, which are a type of event);
predicting, by a machine learning model, a predicted event based on the sequence of events of the journey; (Paragraph [0058] “For tune-in prediction, there may be a need or desire to both (i) predict whether a user has any interest in a campaign program, and (ii) predict the user's interest level in the target program for the campaign.” Predicting an event includes predicting whether a user has any interest in a campaign).
and causing display of the predicted event. (Paragraph [0036] “According to embodiments of this disclosure, an electronic device 101 is included in the network configuration 100. The electronic device 101 can include at least one of a bus 110, a processor 120, a memory 130, an input/output (I/O) interface 150, a display 160, a communication interface 170, or a sensor 180.” A display is a user interface component that can cause display of the predicted event.)
Regarding claim 10, Zhao teaches:
All of the material disclosed in claim 9, and additionally teaches
the machine learning model is a transformer-based machine learning model
(Paragraph [0034] “A deep learning model structure may use a transformer structure and attention to map user data from the behavior shift feature space into a Fibonacci confidence interval level prediction.” A deep learning model is a machine learning model. Using a transformer structure means it is a transformer-based machine learning model).
and the predicted event is predicted based on positional embeddings of each event in the sequence of events of the journey. (Paragraph [0069] “The data on which the transformer layers 309a-309d operate is influenced by a positional embedding 312.” The transformer layers are used in the machine learning model that makes the prediction. Thus, the prediction is based on positional embeddings.)
Regarding claim 15, Zhao teaches:
A computing system comprising: a processor; and a non-transitory computer-readable medium having stored thereon instructions that when executed by the processor, cause the processor to perform operations including: (Paragraph [0036] “According to embodiments of this disclosure, an electronic device 101 is included in the network configuration 100. The electronic device 101 can include at least one of a bus 110, a processor 120, a memory 130, an input/output (I/O) interface 150, a display 160, a communication interface 170, or a sensor 180.” A processor is included as part of the configuration, and, as noted above, paragraph [0010] of Zhao discloses the use of a computer-readable medium.)
obtaining, by a machine learning model, a journey, the journey comprising a sequence of events; (Paragraph [0051] “Information from the database 203 is supplied to a feature engineering function 204, which determines the user behaviors of interest relating to consumed linear entertainment programs… The output from the feature engineering function 204 is supplied to a periodic data aggregation function 205, which aggregates the user behaviors on a periodic (such as daily) basis…Aggregated data regarding the user's consumption of linear entertainment programming can be forwarded from the periodic data aggregation function 205 to a data extract, transform, and load (ETL) function 206, which operates to provide the aggregated user behavior information populating a behavior feature database 207. The behavior feature database 207 supplies user behavior data relating to linear entertainment programs to a training data generation function 208 as necessary for at least a determination of a Fibonacci confidence interval level of each user's interest in viewed programs and a behavior shift feature for each user relating to viewed programs.” A day of aggregated user behavior data is a journey; supplying that data from the database corresponds to the obtaining step; and the journey (a day of aggregated user behavior data) comprises a sequence of user behaviors of interest relating to consumed linear entertainment programs, which are a type of event);
predicting, by the machine learning model, a probability of customers triggering one or more events of the sequence of events of the journey; (Paragraph [0082] “A sigmoid function 720 operates on outputs of the FC layers 702 and produces at least one of a label 721 (during training) or a tune-in probability score 722 (during inferencing)” Predicting a probability is considered to include producing tune-in probability).
and causing, by the user interface component, display of the probability of customers triggering one or more events of the sequence of events of the journey. (Paragraph [0036] “According to embodiments of this disclosure, an electronic device 101 is included in the network configuration 100. The electronic device 101 can include at least one of a bus 110, a processor 120, a memory 130, an input/output (I/O) interface 150, a display 160, a communication interface 170, or a sensor 180.” A display is a user interface component that can cause display of the probability.)
Regarding claim 16, Zhao teaches:
All of the material disclosed in claim 15, and additionally teaches
the machine learning model is a transformer-based machine learning model (Paragraph [0034] “A deep learning model structure may use a transformer structure and attention to map user data from the behavior shift feature space into a Fibonacci confidence interval level prediction.” A deep learning model is a machine learning model. Using a transformer structure means it is a transformer-based machine learning model).
and the probability of customers triggering one or more events of the sequence of events of the journey is predicted based on positional embeddings of each event in the sequence of events of the journey. (Paragraph [0069] “The data on which the transformer layers 309a-309d operate is influenced by a positional embedding 312.” The transformer layers are used in the machine learning model that predicts the probability. Thus, the prediction is based on positional embeddings.)
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 5, 8, 11-14, and 17-20 are rejected under 35 U.S.C. 103 as being unpatentable over Zhao in view of Kang et al. (US 20220301183), hereafter Kang, and further in view of Chen et al., “Behavior Sequence Transformer for E-commerce Recommendation in Alibaba,” hereafter Chen.
Regarding claim 5, the current embodiment of Zhao teaches:
All of the material disclosed in claim 1. In addition, Zhao suggests an embedding layer is useful in recommendation systems to represent behavior changes ((Zhao) Paragraph [0034] “A behavior shift feature space can be designed that uses traditional behavior sequences (used in sequential machine learning structures to represent behavior changes as in many embedding layer-based recommendation systems)”). The claimed invention is a system that is heavily involved with predicting customer interaction, so it could be considered a recommendation system.
However, the current embodiment of Zhao does not explicitly teach
training an embedding layer of the machine learning model to generate an embedding for each event in the input sequence of events of the input journey or
training a transformer encoder layer of the machine learning model to generate a positional embedding for each embedding of each event in the input sequence of events of the input journey.
Chen explicitly teaches the use of an embedding layer that sends the embeddings to the transformer layer ((Chen) Section 2.1, Embedding Layer: “The first component is the embedding layer, which embeds all input features into a fixed-size low dimensional vectors”; see Figure 1, which shows the embedding layer sending embeddings to the transformer layer) and training the model ((Chen) Section 2.3 “To train the model,”).
Zhao and Chen are analogous art because both concern the use of neural networks to predict or simulate human behavior.
Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to implement the embedding layer explicitly taught by Chen into the current embodiment of Zhao, based on the suggestion in Zhao's own specification, in order to represent existing customer interactions.
Zhao in view of Chen still does not explicitly teach:
a transformer encoder layer of the machine learning model to generate a positional embedding for each embedding of each event in the input sequence of events of the input journey.
Kang teaches:
Encoding a position of each object based on a transformer encoder layer (TransFormer encoder network) ((Kang) Paragraph [0041] “encoding a center point position of each target object based on a TransFormer encoder network, to obtain a center point encoding feature of each target object”) and that positional embedding (position encoding) is used to solve the problem of a transformer's inability to obtain position information of a sequence by implicit learning ((Kang) Paragraph [0044] “In principle, by the Transformer, position information of a sequence cannot be obtained by implicit learning. In order to process a sequence problem, in the Transformer, position encoding (Position Encode/Embedding, PE) is used to solve this problem”).
Zhao teaches that a transformer encoder layer (transformer layer) learns a deeper representation for each event in the input sequence of events of the input journey (each behavior sequence) ((Zhao) Paragraph [0073] “One or more of the transformer layers can help to learn a deeper representation of each behavior sequence.”).
Zhao, Chen, and Kang are analogous art because all three references are in the same field of endeavor: utilizing transformer layers with positional embeddings.
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to apply the teachings of Kang concerning the generation of positional encodings by a transformer encoder layer to the transformer encoder layer taught by Zhao. One having ordinary skill in the art would have been motivated to implement a transformer layer generating positional embeddings in order to obtain position information for the sequence of customer interactions and their relationships. This implementation, combined with the previous improvement, would result in the predictable result of claim 5. Therefore, claim 5 is obvious.
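For illustration only, the combination discussed above (Chen's embedding layer feeding Zhao's transformer layers, with position information added per Kang) can be sketched as follows; the use of PyTorch and all dimensions are assumptions, since none of the references prescribes a particular implementation:

    # Hypothetical sketch: an embedding layer (per Chen) whose outputs, summed
    # with learned positional embeddings (per Kang), feed a transformer encoder
    # that learns a deeper representation of each behavior sequence (per Zhao).
    import torch
    import torch.nn as nn

    class JourneyEncoder(nn.Module):
        def __init__(self, num_events=100, dim=32, max_len=50):
            super().__init__()
            self.event_embedding = nn.Embedding(num_events, dim)       # embedding layer
            self.position_embedding = nn.Embedding(max_len, dim)       # positional embedding
            layer = nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)
            self.encoder = nn.TransformerEncoder(layer, num_layers=2)  # transformer encoder layers

        def forward(self, event_ids):  # event_ids: (batch, seq_len) indices of events
            positions = torch.arange(event_ids.size(1), device=event_ids.device)
            x = self.event_embedding(event_ids) + self.position_embedding(positions)
            return self.encoder(x)     # deeper representation of the event sequence

    out = JourneyEncoder()(torch.tensor([[1, 5, 7]]))  # a toy three-event journey
    print(out.shape)  # torch.Size([1, 3, 32])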
Regarding claim 8, the current embodiment of Zhao teaches:
All the material disclosed in claim 1, and additionally teaches:
simulating the customer interaction with the input journey ((Zhao) Paragraph [0052] “… the trained machine learning model will operate to generate an inference… The inference data generation function 214 exploits the Fibonacci confidence interval levels and the user behavior shift features in the course of predicting user interest in the campaign program.” Simulating a customer interaction is considered to include generating an inference) based at least on the positional embedding for each embedding of each event in the input sequence of events of the input journey ((Zhao) Paragraph [0073] “The machine learning model can take features from the behavior shift feature space and add a positional embedding for each behavior sequence” The behavior shift features used to simulate a customer interaction above can have a positional embedding added on to them, which would make the simulation based on the positional embeddings.).
Zhao further suggests an embedding layer is useful in recommendation systems to represent behavior changes ((Zhao) Paragraph [0034] “A behavior shift feature space can be designed that uses traditional behavior sequences (used in sequential machine learning structures to represent behavior changes as in many embedding layer-based recommendation systems)”). The claimed invention is a system that is heavily involved with predicting customer interaction, so it could be considered a recommendation system.
However, the current embodiment of Zhao does not explicitly teach
generating, by an embedding layer, an embedding for each event in the input sequence of events of the input journey or
generating, by a transformer encoder layer, a positional embedding for each embedding of each event in the input sequence of events of the input journey.
Chen explicitly teaches the use of an embedding layer that sends the embeddings to the transformer layer ((Chen) Section 2.1, Embedding Layer: “The first component is the embedding layer, which embeds all input features into a fixed-size low dimensional vectors”; see Figure 1, which shows the embedding layer sending embeddings to the transformer layer).
Zhao and Chen are analogous art because both concern the use of neural networks to predict or simulate human behavior.
Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to implement the embedding layer explicitly taught by Chen into the current embodiment of Zhao, based on the suggestion in Zhao's own specification, in order to represent existing customer interactions.
Zhao in view of Chen still does not explicitly teach:
generating, by a transformer encoder layer, a positional embedding for each embedding of each event in the input sequence of events of the input journey.
Kang teaches:
Encoding a position of each object based on a transformer encoder layer (TransFormer encoder network) ((Kang) Paragraph [0041] “encoding a center point position of each target object based on a TransFormer encoder network, to obtain a center point encoding feature of each target object”) and that positional embedding (position encoding) is used to solve the problem of a transformer's inability to obtain position information of a sequence by implicit learning ((Kang) Paragraph [0044] “In principle, by the Transformer, position information of a sequence cannot be obtained by implicit learning. In order to process a sequence problem, in the Transformer, position encoding (Position Encode/Embedding, PE) is used to solve this problem”).
Zhao teaches that a transformer encoder layer (transformer layer) learns a deeper representation for each event in the input sequence of events of the input journey (each behavior sequence) ((Zhao) Paragraph [0073] “One or more of the transformer layers can help to learn a deeper representation of each behavior sequence.”).
Zhao, Chen, and Kang are analogous art because all three references are in the same field of endeavor: utilizing transformer layers with positional embeddings.
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to apply the teachings of Kang concerning the generation of positional encodings by a transformer encoder layer to the transformer encoder layer taught by Zhao. One having ordinary skill in the art would have been motivated to implement a transformer layer generating positional embeddings in order to obtain position information for the sequence of customer interactions and their relationships. This implementation, combined with the previous improvement, would result in the predictable result of claim 8. Therefore, claim 8 is obvious.
Regarding claim 11, the current embodiment of Zhao teaches:
All of the material disclosed in claim 9, and additionally teaches
obtaining training data from a plurality of journeys, each journey of the plurality of journeys comprising a sequence of events, wherein the training data indicates customer interactions with each event in the sequence of events of each journey of the plurality of journeys (Paragraph [0051] “Information from the database 203 is supplied to a feature engineering function 204, which determines the user behaviors of interest relating to consumed linear entertainment programs… The output from the feature engineering function 204 is supplied to a periodic data aggregation function 205, which aggregates the user behaviors on a periodic (such as daily) basis…Aggregated data regarding the user's consumption of linear entertainment programming can be forwarded from the periodic data aggregation function 205 to a data extract, transform, and load (ETL) function 206, which operates to provide the aggregated user behavior information populating a behavior feature database 207. The behavior feature database 207 supplies user behavior data relating to linear entertainment programs to a training data generation function 208 as necessary for at least a determination of a Fibonacci confidence interval level of each user's interest in viewed programs and a behavior shift feature for each user relating to viewed programs.” The aggregated user behavior data is training data from a plurality of journeys; supplying that data from the database corresponds to the obtaining step; each journey (a set of aggregated user behavior data) comprises a sequence of user behaviors of interest relating to consumed linear entertainment programs, which are a type of event; and the users’ interest indicates customer interaction);
and training, using the training data, the machine learning model to predict the predicted event based on the sequence of events of the journey (Paragraph [0051-0052] “The behavior feature database 207 provides user behavior input including the Fibonacci confidence interval levels and behavior shift features to both a model training function 209 and a training data registration function 210. The model training function 209 trains a machine learning model and provides model data to a behavior model registration function 211 … An inference data generation function 214 is based on inputs regarding those features from the target feature engineering function 212, as well as the machine learning model defined by the behavior model registration function 211, and user behavior information from the training data registration function 210, and generates an inference 215.” Predicting a predicted event is considered to include generating an inference so a model that predicts an event is considered to include an inference data generation function).
Zhao further suggests an embedding layer is useful in recommendation systems to represent behavior changes ((Zhao) Paragraph [0034] “A behavior shift feature space can be designed that uses traditional behavior sequences (used in sequential machine learning structures to represent behavior changes as in many embedding layer-based recommendation systems)”). The claimed invention is a system that is heavily involved with predicting customer interaction, so it could be considered a recommendation system.
The current embodiment of Zhao does not explicitly teach
training an embedding layer of the machine learning model to generate an embedding for each event of the sequence of events of the journey or
training a transformer encoder layer of the machine learning model to generate a positional embedding for each embedding of each event of the sequence of events of the journey.
Chen explicitly teaches the use of an embedding layer that sends the embeddings to the transformer layer ((Chen) Section 2.1, Embedding Layer: “The first component is the embedding layer, which embeds all input features into a fixed-size low dimensional vectors”; see Figure 1, which shows the embedding layer sending embeddings to the transformer layer).
Zhao and Chen are analogous art because both concern the use of neural networks to predict or simulate human behavior.
Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to implement the embedding layer explicitly taught by Chen into the current embodiment of Zhao, based on the suggestion in Zhao's own specification, in order to represent existing customer interactions.
Zhao in view of Chen still does not explicitly teach:
training a transformer encoder layer of the machine learning model to generate a positional embedding for each embedding of each event of the sequence of events of the journey.
Kang teaches:
Encoding a position of each object based on a transformer encoder layer (TransFormer encoder network) ((Kang) Paragraph [0041] “encoding a center point position of each target object based on a TransFormer encoder network, to obtain a center point encoding feature of each target object”) and that positional embedding (position encoding) is used to solve the problem of a transformer's inability to obtain position information of a sequence by implicit learning ((Kang) Paragraph [0044] “In principle, by the Transformer, position information of a sequence cannot be obtained by implicit learning. In order to process a sequence problem, in the Transformer, position encoding (Position Encode/Embedding, PE) is used to solve this problem”).
Zhao teaches that a transformer encoder layer (transformer layer) learns a deeper representation for each event in the input sequence of events of the input journey (each behavior sequence) ((Zhao) Paragraph [0073] “One or more of the transformer layers can help to learn a deeper representation of each behavior sequence.”).
Zhao, Chen, and Kang are analogous art because all three references are in the same field of endeavor: utilizing transformer layers with positional embeddings.
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to apply the teachings of Kang concerning the generation of positional encodings by a transformer encoder layer to the transformer encoder layer taught by Zhao, based on the teachings of Chen. One having ordinary skill in the art would have been motivated to implement a transformer layer generating positional embeddings in order to obtain position information for the sequence of customer interactions and their relationships. This implementation, combined with the previous improvement, would result in the predictable result of claim 11. Therefore, claim 11 is obvious.
Regarding claim 12, Zhao in view of Chen and Kang teaches:
All of the material disclosed in claim 11, and Zhao additionally teaches
obtaining customer data for each customer of the plurality of journeys of the training data, wherein the customer data comprises demographic data;
and further training the machine learning model to predict the predicted event using the customer data. ((Zhao) Paragraph [0071] “In some cases, the feature data used in the present disclosure includes the following information” and diagram below paragraph [0071]. Geographic data is considered a form of demographic data, and, as discussed above with respect to Zhao paragraphs [0051]-[0052], the feature data is used as part of the training to simulate customer interaction)
Regarding claim 13, Zhao in view of Chen and Kang teaches:
All of the material disclosed in claim 11, and Zhao additionally teaches
the training data further comprises data indicating a time of the customer interactions with each event in the sequence of events of each journey of the plurality of journeys;
and further training the machine learning model to predict the predicted event using the time of the customer interactions. ((Zhao) Paragraph [0071] “In some cases, the feature data used in the present disclosure includes the following information” and diagram below paragraph [0071]. Time of day indicates the time of customer interactions. As above, Zhao paragraphs [0051]-[0052] state that feature data is used as part of the training to simulate customer interaction)
Regarding claim 14, the current embodiment of Zhao teaches:
All the material disclosed in claim 9, and additionally teaches
predicting the predicted event ((Zhao) Paragraph [0058] “For tune-in prediction, there may be a need or desire to both (i) predict whether a user has any interest in a campaign program, and (ii) predict the user's interest level in the target program for the campaign.” Predicting an event is considered to include predicting whether a user has any interest in a campaign.) based at least on the positional embedding for each embedding of each event in the sequence of events of the journey ((Zhao) Paragraph [0073] “The machine learning model can take features from the behavior shift feature space and add a positional embedding for each behavior sequence” The behavior shift features used to simulate a customer interaction above can have a positional embedding added on to them, which would make the simulation based on the positional embeddings.).
Zhao further suggests an embedding layer is useful in recommendation systems to represent behavior changes ((Zhao) Paragraph [0034] “A behavior shift feature space can be designed that uses traditional behavior sequences (used in sequential machine learning structures to represent behavior changes as in many embedding layer-based recommendation systems)”). The claimed invention is a system that is heavily involved with predicting customer interaction, so it could be considered a recommendation system. Furthermore, Chen explicitly teaches the use of an embedding layer that sends the embeddings to the transformer layer ((Chen) Section 2.1, Embedding Layer: “The first component is the embedding layer, which embeds all input features into a fixed-size low dimensional vectors”; see Figure 1, which shows the embedding layer sending embeddings to the transformer layer).
The current embodiment of Zhao does not explicitly teach
generating, by an embedding layer, an embedding for each event in the sequence of events of the journey or
generating, by a transformer encoder layer, a positional embedding for each embedding of each event in the sequence of events of the journey.
Zhao and Chen are analogous art because both concern the use of neural networks to predict or simulate human behavior.
Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to implement the embedding layer explicitly taught by Chen into the current embodiment of Zhao, based on the suggestion in Zhao's own specification, in order to represent existing customer interactions.
Zhao in view of Chen still does not explicitly teach:
generating, by a transformer encoder layer, a positional embedding for each embedding of each event in the sequence of events of the journey.
Kang teaches:
Encoding a position of each object based on a transformer encoder layer (TransFormer encoder network) ((Kang) Paragraph [0041] “encoding a center point position of each target object based on a TransFormer encoder network, to obtain a center point encoding feature of each target object”) and that positional embedding (position encoding) is used to solve the problem of a transformer's inability to obtain position information of a sequence by implicit learning ((Kang) Paragraph [0044] “In principle, by the Transformer, position information of a sequence cannot be obtained by implicit learning. In order to process a sequence problem, in the Transformer, position encoding (Position Encode/Embedding, PE) is used to solve this problem”).
Zhao teaches that a transformer encoder layer (transformer layer) learns a deeper representation for each event in the input sequence of events of the input journey (each behavior sequence) ((Zhao) Paragraph [0073] “One or more of the transformer layers can help to learn a deeper representation of each behavior sequence.”).
Zhao, Chen, and Kang are analogous art because all three references are in the same field of endeavor: utilizing transformer layers with positional embeddings.
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to apply the teachings of Kang concerning the generation of positional encodings by a transformer encoder layer to the transformer encoder layer taught by Zhao, based on the teachings of Chen. One having ordinary skill in the art would have been motivated to implement a transformer layer generating positional embeddings in order to obtain position information for the sequence of customer interactions and their relationships. This implementation, combined with the previous improvement, would result in the predictable result of claim 14. Therefore, claim 14 is obvious.
Regarding claim 17, Zhao teaches:
All of the material disclosed in claim 15, and additionally teaches
obtaining, by a model training engine, training data from a plurality of journeys, each journey of the plurality of journeys comprising a sequence of events, wherein the training data indicates customer interactions with each event in the sequence of events of each journey of the plurality of journeys; (Paragraph [0051] “Information from the database 203 is supplied to a feature engineering function 204, which determines the user behaviors of interest relating to consumed linear entertainment programs… The output from the feature engineering function 204 is supplied to a periodic data aggregation function 205, which aggregates the user behaviors on a periodic (such as daily) basis…Aggregated data regarding the user's consumption of linear entertainment programming can be forwarded from the periodic data aggregation function 205 to a data extract, transform, and load (ETL) function 206, which operates to provide the aggregated user behavior information populating a behavior feature database 207. The behavior feature database 207 supplies user behavior data relating to linear entertainment programs to a training data generation function 208 as necessary for at least a determination of a Fibonacci confidence interval level of each user's interest in viewed programs and a behavior shift feature for each user relating to viewed programs.” The aggregated user behavior data is training data from a plurality of journeys; supplying that data from the database corresponds to the obtaining step; each journey (a set of aggregated user behavior data) comprises a sequence of user behaviors of interest relating to consumed linear entertainment programs, which are a type of event; and the users’ interest indicates customer interaction);
and training, by the model training engine using the training data, the machine learning model to predict the probability of customers triggering one or more events of the sequence of events of the journey (Paragraph [0051-0052] “The behavior feature database 207 provides user behavior input including the Fibonacci confidence interval levels and behavior shift features to both a model training function 209 and a training data registration function 210. The model training function 209 trains a machine learning model and provides model data to a behavior model registration function 211 … An inference data generation function 214 is based on inputs regarding those features from the target feature engineering function 212, as well as the machine learning model defined by the behavior model registration function 211, and user behavior information from the training data registration function 210, and generates an inference 215”, Paragraph [0082] “A sigmoid function 720 operates on outputs of the FC layers 702 and produces at least one of a label 721 (during training) or a tune-in probability score 722 (during inferencing)” Generating an inference is considered to include producing a tune-in probability score. Predicting the probability of customers triggering one or more events of the sequence of events of the journey is considered to include a tune-in probability. Thus, predicting the probability of customers triggering one or more events of the sequence of events of the journey is considered to include generating an inference).
Zhao further suggests an embedding layer is useful in recommendation systems to represent behavior changes ((Zhao) Paragraph [0034] “A behavior shift feature space can be designed that uses traditional behavior sequences (used in sequential machine learning structures to represent behavior changes as in many embedding layer-based recommendation systems)”). The claimed invention is heavily involved with predicting customer interaction, so it could be considered a recommendation system.
The current embodiment of Zhao does not explicitly teach
training an embedding layer of the machine learning model to generate an embedding for each event of the sequence of events of the journey or
training a transformer encoder layer of the machine learning model to generate a positional embedding for each embedding of each event of the sequence of events of the journey.
Zhao and Chen are analogous art because they both deal with the use of neural networks to predict or simulate human behavior.
Chen explicitly teaches the use of an embedding layer that sends the embeddings to the transformer layer ((Chen) Section 2.1 Embedding Layer “The first component is the embedding layer, which embeds all input features into a fixed-size low dimensional vectors”, Figure 1, Figure 1 shows the embedding layer sending embeddings to the transformer layer.) and training the model ((Chen) Section 2.3 “To train the model,”).
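For context, the embedding-layer-to-transformer-layer arrangement quoted from Chen can be illustrated with a minimal sketch, assuming PyTorch and hypothetical sizes (a sketch of the general technique only, not Chen's actual model):

    import torch
    import torch.nn as nn

    # Sketch: an embedding layer maps each event ID in a sequence to a
    # fixed-size low-dimensional vector, and the resulting embeddings are
    # sent to a transformer encoder layer. All sizes are assumptions.
    num_event_types, d_model = 1000, 64
    embedding = nn.Embedding(num_event_types, d_model)
    transformer = nn.TransformerEncoderLayer(d_model=d_model, nhead=4, batch_first=True)

    events = torch.randint(0, num_event_types, (2, 10))  # 2 journeys, 10 events each
    x = embedding(events)      # (2, 10, 64) event embeddings
    out = transformer(x)       # deeper representation of each event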
Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to implement the embedding layer explicitly taught by Chen into the current embodiment of Zhao, based on the suggestion in the specification of Zhao, in order to represent existing customer interactions.
Zhao in view of Chen still does not teach:
training a transformer encoder layer of the machine learning model to generate a positional embedding for each embedding of each event of the sequence of events of the journey.
Kang teaches:
Encoding a position of each object based on a transformer encoder layer (TransFormer encoder network) ((Kang) Paragraph [0041] “encoding a center point position of each target object based on a TransFormer encoder network, to obtain a center point encoding feature of each target object”) and that positional embedding (position encoding) is used to solve the problem of the transformer's inability to obtain position information of a sequence by implicit learning ((Kang) Paragraph [0044] “In principle, by the Transformer, position information of a sequence cannot be obtained by implicit learning. In order to process a sequence problem, in the Transformer, position encoding (Position Encode/Embedding, PE) is used to solve this problem”).
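For context, the position encoding Kang refers to is conventionally realized as a sinusoidal table added elementwise to each embedding, supplying the order information that self-attention cannot learn implicitly. A minimal sketch follows, assuming PyTorch; Kang's specific formulation is not reproduced:

    import math
    import torch

    # Conventional sinusoidal positional encoding: position i, dimension 2k
    # uses sin(i / 10000^(2k/d)); dimension 2k+1 uses cos of the same
    # argument. Added to embeddings so the model obtains position
    # information that attention alone cannot recover.
    def positional_encoding(seq_len: int, d_model: int) -> torch.Tensor:
        position = torch.arange(seq_len).unsqueeze(1).float()
        div_term = torch.exp(torch.arange(0, d_model, 2).float() * (-math.log(10000.0) / d_model))
        pe = torch.zeros(seq_len, d_model)
        pe[:, 0::2] = torch.sin(position * div_term)
        pe[:, 1::2] = torch.cos(position * div_term)
        return pe

    # Each event embedding gets its positional embedding added elementwise.
    embeddings = torch.randn(10, 64)                 # 10 events, 64-dim embeddings
    positioned = embeddings + positional_encoding(10, 64)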
Zhao teaches a transformer encoder layer (transformer layer) learns a deeper representation for each event in the input sequence of events of the input journey (each behavior sequence) ((Zhao) Paragraph [0073] “One or more of the transformer layers can help to learn a deeper representation of each behavior sequence.”).
Zhao, Chen, and Kang are analogous art because all three references deal in the same field of endeavor: utilizing transformer layers with positional embeddings.
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to apply the teachings of Kang concerning the generation of positional encodings by a transformer encoder layer to the transformer encoder layer taught by Zhao, as modified by the teachings of Chen. One having ordinary skill in the art would have been motivated to implement a transformer layer generating positional embeddings in order to obtain position information about the sequence of customer interactions and their relationships. This implementation, combined with the previous improvement, would yield the predictable result of claim 17. Therefore, claim 17 is obvious.
Regarding claim 18, Zhao in view of Kang and Chen teaches:
All of the material disclosed in claim 17, and Zhao additionally teaches:
obtaining customer data for each customer of the plurality of journeys of the training data, wherein the customer data comprises demographic data;
and further training the machine learning model to predict the probability of customers triggering one or more events of the sequence of events of the journey using the customer data ((Zhao) Paragraph [0071] “In some cases, the feature data used in the present disclosure includes the following information” and the diagram below paragraph [0071]. Geographic data is considered a form of demographic data, and, as stated above, Zhao paragraphs [0051]-[0052] disclose that the feature data is used as part of the training to simulate customer interaction).
Regarding claim 19, Zhao in view of Kang and Chen teaches:
All of the material disclosed in claim 17, and Zhao additionally teaches:
the training data further comprises data indicating a time of the customer interactions with each event in the sequence of events of each journey of the plurality of journeys;
and further training the machine learning model to predict the probability of customers triggering one or more events of the sequence of events of the journey using the time of the customer interactions ((Zhao) Paragraph [0071] “In some cases, the feature data used in the present disclosure includes the following information” and the diagram below paragraph [0071]. Time of day indicates the time of the customer interactions. As above, Zhao paragraphs [0051]-[0052] state that the feature data is used as part of the training to simulate customer interaction).
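For context, feature data such as the geographic attributes (claim 18) and time-of-day values (claim 19) discussed above is conventionally incorporated by concatenating it onto each event's representation before training. A minimal sketch, assuming PyTorch and hypothetical features:

    import torch

    # Sketch: concatenating hypothetical side features (a one-hot region
    # vector and a normalized time-of-day scalar) onto event embeddings so
    # the model can be trained on them. Shapes are assumptions.
    event_embeddings = torch.randn(10, 64)            # 10 events in a journey
    region = torch.zeros(10, 5); region[:, 2] = 1.0   # one-hot geographic feature
    hour = torch.full((10, 1), 14.0) / 24.0           # normalized time of day
    model_input = torch.cat([event_embeddings, region, hour], dim=1)  # (10, 70)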
Regarding claim 20, the current embodiment of Zhao teaches:
All of the material disclosed in claim 15, and additionally teaches:
predicting the probability of customers triggering one or more events of the sequence of events of the journey (Paragraph [0082] “A sigmoid function 720 operates on outputs of the FC layers 702 and produces at least one of a label 721 (during training) or a tune-in probability score 722 (during inferencing)” Predicting the probability of customers triggering one or more events is considered to include a tune-in probability score.) based at least on the positional embedding for each embedding of each event in the sequence of events of the journey ((Zhao) Paragraph [0073] “The machine learning model can take features from the behavior shift feature space and add a positional embedding for each behavior sequence” The behavior shift features used for the prediction above can have a positional embedding added to them, which would make the prediction based on the positional embeddings).
Zhao further suggests an embedding layer is useful in recommendation systems to represent behavior changes ((Zhao) Paragraph [0034] “A behavior shift feature space can be designed that uses traditional behavior sequences (used in sequential machine learning structures to represent behavior changes as in many embedding layer-based recommendation systems)”). The claimed invention is heavily involved with predicting customer interaction, so it could be considered a recommendation system.
The current embodiment of Zhao does not explicitly teach
generating, by an embedding layer, an embedding for each event in the sequence of events of the journey or
generating, by a transformer encoder layer, a positional embedding for each embedding of each event in the sequence of events of the journey.
Chen explicitly teaches the use of an embedding layer that sends the embeddings to the transformer layer ((Chen) Section 2.1 Embedding Layer “The first component is the embedding layer, which embeds all input features into a fixed-size low dimensional vectors”, Figure 1, Figure 1 shows the embedding layer sending embeddings to the transformer layer).
Zhao and Chen are analogous art because they both deal with the use of neural networks to predict or simulate human behavior.
Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to implement the embedding layer explicitly taught by Chen into the current embodiment of Zhao, based on the suggestion in the specification of Zhao, in order to represent existing customer interactions.
Zhao in view of Chen still does not explicitly teach:
generating, by a transformer encoder layer, a positional embedding for each embedding of each event in the sequence of events of the journey.
Kang teaches:
Encoding a position of each object based on a transformer encoder layer (TransFormer encoder network) ((Kang) Paragraph [0041] “encoding a center point position of each target object based on a TransFormer encoder network, to obtain a center point encoding feature of each target object”) and that positional embedding (position encoding) is used to solve the problem of the transformer's inability to obtain position information of a sequence by implicit learning ((Kang) Paragraph [0044] “In principle, by the Transformer, position information of a sequence cannot be obtained by implicit learning. In order to process a sequence problem, in the Transformer, position encoding (Position Encode/Embedding, PE) is used to solve this problem”).
Zhao teaches a transformer encoder layer (transformer layer) learns a deeper representation for each event in the input sequence of events of the input journey (each behavior sequence) ((Zhao) Paragraph [0073] “One or more of the transformer layers can help to learn a deeper representation of each behavior sequence.”).
Zhao, Chen, and Kang are analogous art because all three references deal in the same field of endeavor: utilizing transformer layers with positional embeddings.
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to apply the teachings of Kang concerning the generation of positional encodings by a transformer encoder layer to the transformer encoder layer taught by Zhao, as modified by the teachings of Chen. One having ordinary skill in the art would have been motivated to implement a transformer layer generating positional embeddings in order to obtain position information about the sequence of customer interactions and their relationships. This implementation, combined with the previous improvement, would yield the predictable result of claim 20. Therefore, claim 20 is obvious.
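For context, the combined arrangement mapped to claim 20 above (Chen's embedding layer, Kang's positional embedding, Zhao's transformer encoder layer and sigmoid probability output) can be illustrated end to end. The sketch below assumes PyTorch, a learned positional embedding, and hypothetical sizes; it illustrates the general shape of the combination, not any reference's implementation:

    import torch
    import torch.nn as nn

    # Illustrative end-to-end sketch of the combined arrangement: an
    # embedding layer per event, a positional embedding added to each
    # event embedding, a transformer encoder layer producing a deeper
    # representation, and a sigmoid head producing a probability score.
    # All names and sizes are assumptions.
    class JourneyPredictor(nn.Module):
        def __init__(self, num_event_types=1000, d_model=64, max_len=50):
            super().__init__()
            self.event_embedding = nn.Embedding(num_event_types, d_model)
            self.positional_embedding = nn.Embedding(max_len, d_model)
            self.encoder = nn.TransformerEncoderLayer(d_model=d_model, nhead=4, batch_first=True)
            self.head = nn.Linear(d_model, 1)

        def forward(self, events: torch.Tensor) -> torch.Tensor:
            positions = torch.arange(events.size(1), device=events.device)
            x = self.event_embedding(events) + self.positional_embedding(positions)
            x = self.encoder(x)                 # deeper representation per event
            return torch.sigmoid(self.head(x))  # probability per event in the journey

    probs = JourneyPredictor()(torch.randint(0, 1000, (2, 10)))  # shape (2, 10, 1)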
Double Patenting
Claims 9 and 10 of this application are patentably indistinct from claim 10 of Application No. 18/455005 as of 2/25/2026. Claims 15 and 16 of this application are patentably indistinct from claim 17 of Application No. 18/455005 as of 2/25/2026. Pursuant to 37 CFR 1.78(f), when two or more applications filed by the same applicant or assignee contain patentably indistinct claims, elimination of such claims from all but one application may be required in the absence of good and sufficient reason for their retention during pendency in more than one application. Applicant is required to either cancel the patentably indistinct claims from all but one application or maintain a clear line of demarcation between the applications. See MPEP § 822.
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP § 2146 et seq. for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
The filing of a terminal disclaimer by itself is not a complete reply to a nonstatutory double patenting (NSDP) rejection. A complete reply requires that the terminal disclaimer be accompanied by a reply requesting reconsideration of the prior Office action. Even where the NSDP rejection is provisional the reply must be complete. See MPEP § 804, subsection I.B.1. For a reply to a non-final Office action, see 37 CFR 1.111(a). For a reply to final Office action, see 37 CFR 1.113(c). A request for reconsideration while not provided for in 37 CFR 1.113(c) may be filed after final for consideration. See MPEP §§ 706.07(e) and 714.13.
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The actual filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/apply/applying-online/eterminal-disclaimer.
Claims 9 and 10 are provisionally rejected on the ground of nonstatutory double patenting as being unpatentable over claim 10 of copending Application No. 18/455005 (cited in IDS) in view of “Behavior sequence transformer for e-commerce recommendation in Alibaba” (Chen et al.), hereafter Chen.
Regarding claim 9 of the instant application:
Claim 10 of Application No. 18/455005 teaches all of the material disclosed in claim 9 of the instant application, as shown in the element-by-element comparison below, where each element of claim 9 of the instant application is followed by the corresponding element of claim 10 of Application No. 18/455005:

Instant Application: 9. A non-transitory computer-readable medium storing executable instructions, which when executed by a processing device, cause the processing device to perform operations comprising:
Application No. 18/455005: 10. A non-transitory computer readable medium storing code, the code comprising instructions executable by a processor to:

Instant Application: obtaining a journey, the journey comprising a sequence of events;
Application No. 18/455005: obtain a user journey including a plurality of touchpoints;

Instant Application: predicting, by a machine learning model, a predicted event based on the sequence of events of the journey;
Application No. 18/455005: perform a simulation of the user journey based on the probability score;

Instant Application: and causing display of the predicted event.
Application No. 18/455005: and generate a text describing the user journey based on the simulation.
Therefore, claim 9 of the instant application is anticipated by claim 10 of Application No. 18/455005.
Regarding claim 10 of the instant application:
Claim 10 of Application No. 18/455005 in view of Chen contains all of the material in claim 10 of the instant application.
However, claim 10 of Application No. 18/455005 alone does not contain a transformer-based machine learning model or recite that the customer interaction with the input journey is simulated based on positional embeddings determined for each event of the input sequence of events.
Chen teaches a transformer layer learns a deeper representation for each item by capturing the relations with other items ((Chen) Section 2.2 Transformer Layer) and positional embedding captures order information ((Chen) Section 2.1 Subheading Positional Embedding).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to implement the transformer layer with positional embedding disclosed in Chen into the invention of claim 10 of Application No. 18/455005 in order to learn the order information and the relations between different parts of the user journey, which would yield the predictable result of claim 10 of the instant application. Therefore, claim 10 of the instant application is obvious.
Claims 15 and 16 are provisionally rejected on the ground of nonstatutory double patenting as being unpatentable over claim 10 of copending Application No. 18/455005 (cited in IDS) in view of “Behavior sequence transformer for e-commerce recommendation in Alibaba” (Chen et al.), hereafter Chen.
Regarding claim 15 of the instant application:
Claim 10 of Application No. 18/455005 teaches all of the material disclosed in claim 15 of the instant application, with the only difference being the statutory category (a system comprising a processor and a computer-readable medium executed on the processor in the instant application versus a computer-readable medium executed by a processor in Application No. 18/455005), as shown in the element-by-element comparison below, where each element of claim 15 of the instant application is followed by the corresponding element of claim 10 of Application No. 18/455005:

Instant Application: 15. A computing system comprising: a processor; and a non-transitory computer-readable medium having stored thereon instructions that when executed by the processor, cause the processor to perform operations including:
Application No. 18/455005: 10. A non-transitory computer readable medium storing code, the code comprising instructions executable by a processor to:

Instant Application: obtaining, by a machine learning model, a journey, the journey comprising a sequence of events;
Application No. 18/455005: obtain a user journey including a plurality of touchpoints;

Instant Application: predicting, by the machine learning model, a probability of customers triggering one or more events of the sequence of events of the journey;
Application No. 18/455005: perform a simulation of the user journey based on the probability score;

Instant Application: and causing, by the user interface component, display of the probability of customers triggering one or more events of the sequence of events of the journey.
Application No. 18/455005: and generate a text describing the user journey based on the simulation.
Therefore, claim 15 of the instant application is anticipated by claim 10 of Application No. 18/455005.
Regarding claim 16 of the instant application:
Claim 10 of Application No. 18/455005 in view of Chen contains all of the material in claim 16 of the instant application, with the sole difference being the statutory category. However, claim 10 of Application No. 18/455005 alone does not contain a transformer-based machine learning model or recite that the customer interaction with the input journey is simulated based on positional embeddings determined for each event of the input sequence of events.
Chen teaches a transformer layer learns a deeper representation for each item by capturing the relations with other items ((Chen) Section 2.2 Transformer Layer) and positional embedding captures order information ((Chen) Section 2.1 Subheading Positional Embedding).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to implement the transformer layer with positional embedding disclosed in Chen into the invention of claim 10 of Application No. 18/455005 in order to learn the order information and the relations between different parts of the user journey, which would yield the predictable result of claim 16 of the instant application. Therefore, claim 16 of the instant application is obvious.
These are provisional nonstatutory double patenting rejections.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Patents and/or related publications are cited in the Notice of References Cited (Form PTO-892) attached to this action to further show the state of the art with respect to transformers, simulation, and prediction.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to DYLAN H LAI whose telephone number is (571)272-8628. The examiner can normally be reached Monday - Friday 7:30am-5:00pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Tamara Kyle, can be reached at 571-252-4241. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
DHL
Examiner
Art Unit 2144
/TAMARA T KYLE/Supervisory Patent Examiner, Art Unit 2144