Prosecution Insights
Last updated: April 18, 2026
Application No. 18/196,823

TRAINING AND DEPLOYING MODELS TO PREDICT CYBERSECURITY EVENTS

Final Rejection (§102, §103, §112)
Filed: May 12, 2023
Examiner: MARTINEZ, TOMMY NMN
Art Unit: 2496
Tech Center: 2400 — Computer Networks
Assignee: International Business Machines Corporation
OA Round: 4 (Final)
Grant Probability: 0% (At Risk)
OA Rounds: 5-6
To Grant: 3y 1m
With Interview: 0%

Examiner Intelligence

Career Allow Rate: 0% (0 granted / 4 resolved; -58.0% vs TC avg)
Interview Lift: +0.0% (minimal lift, with vs. without interview, across resolved cases with interview)
Typical Timeline: 3y 1m avg prosecution
Career History: 34 total applications across all art units; 30 currently pending

Statute-Specific Performance

§101: 3.1% (-36.9% vs TC avg)
§103: 44.3% (+4.3% vs TC avg)
§102: 20.5% (-19.5% vs TC avg)
§112: 32.1% (-7.9% vs TC avg)

Note: statute-level rates are compared against a Tech Center average estimate; based on career data from 4 resolved cases.

Office Action

§102 §103 §112
Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first-inventor-to-file provisions of the AIA.

Response to Arguments

Applicant's arguments filed December 18, 2025 have been fully considered but they are not persuasive.

In page 1 of the remarks, Applicant states that claims 1-20 were rejected in the previous Office Action ("OA"), dated September 25, 2025, as failing to particularly point out and distinctly claim the subject matter regarded as the invention under 35 U.S.C. § 112(b) ("112(b)"). Applicant does not concede that the claims fail under 112(b); claim 1 was amended solely to expedite prosecution, in particular by removing "from a current time" and "relatively limited processing resources available", the phrases identified in the 112(b) rejection of independent claim 1 in the previous OA. Because claim 1 has been amended, the Examiner withdraws the 112(b) rejection of claim 1.

In page 2 of the remarks, Applicant states that independent claims 9 and 17 were rejected under 112(b) for reasons similar to claim 1 as described above. Because the amendments to claims 9 and 17 mirror those now recited in claim 1, the Examiner withdraws the 112(b) rejections of claims 9 and 17, and of dependent claims 10-16 and 18-20, which depend on claims 9 and 17, respectively.

In pages 3-6 of the remarks, Applicant states that claims 1-20 were rejected in the previous OA, dated September 25, 2025, as failing to demonstrate possession of the claimed subject matter as of the original filing date, as mandated by the written description requirement of 35 U.S.C. § 112(a) ("112(a)"). Applicant disagrees with the rejections, but independent claim 1 has been amended to address their grounds, including replacing "relatively limited processing resources available" with "private user information stored thereon". Claim 5, which was also rejected, has been amended as well: its previous limitations were moved to claim 1, and claim 5 now recites "wherein the labels identify key detection events for a base cybersecurity application as labels for events of interest in training". The previous rejection appears in the previous OA (NFOA) at pages 10-11; in particular, it alleged that the limitations of claim 5 are not sufficient "so that one of ordinary skill in the art would recognize that the applicant had possession of the claimed invention". Without arguing the rejections specifically, Applicant has amended claims 1 and 5 and cites MPEP § 2164.01 in page 4 of the remarks, in particular: "Any analysis of whether a particular claim is supported by the disclosure in an application requires a determination of whether that disclosure, when filed, contained sufficient information regarding the subject matter of the claims as to enable one skilled in the pertinent art to make and use the claimed invention […] Mineral Separation v. Hyde, 242 U.S. 261, 270 (1916), namely: is the experimentation needed to practice the invention undue or unreasonable? The Federal Circuit dictated this standard is still determinative. See In re Wands, 858 F.2d 731, 737 (Fed. Cir. 1988)". Applicant states that, with the previous limitations of claim 5 now in claim 1, the claimed invention as embodied in claim 5 can be made and used by a skilled artisan without "any undue experimentation based on (1) Applicant's disclosures; and/or (2) knowledge generally available in the art". Applicant states that the description in paragraphs [0068]-[0071] of the filed Specification includes sufficient detail for one of ordinary skill in the art to recognize that Applicant had possession of the claimed invention, and that the Examiner relied only on paragraphs [0076] and [0082] in determining that claims 1 and 5 are not necessarily sufficient "so that one of ordinary skill in the art would recognize that the applicant had possession of the claimed invention". Applicant has also amended claims 1-8, including moving limitations previously presented in claim 5 to claim 1, clarifying that "predetermined time-bins and labels" are in "low-dimensional causal event streams", and further describing pretraining of the second model in claim 1, e.g., "wherein the pretraining comprises the second model classifying inputs as malicious or normal".

Furthermore, in pages 4-6 of the remarks, Applicant states that the Examiner misapplied the written description standard in forming the present rejection. The NFOA states, at pages 10-11, that for claims 1-5 the "algorithm or steps/procedures for these claimed functions is not explained at all or not explained in sufficient detail […] so that one of ordinary skill in the art would recognize that the application had possession of the claimed invention". Applicant argues that the pertinent standard is not whether the application conveys the exact claim limitation, but rather whether the "disclosure, when filed, contained sufficient information regarding the subject matter of the claims as to enable one skilled in the pertinent art to make and use the claimed invention" (MPEP § 2164.01), and that the "test of enablement is whether one reasonably skilled in the art could make or use the invention from the disclosures in the patent coupled with information known in the art without undue experimentation", not whether the claim itself includes all "steps/procedures for the claimed functions" as alleged in the rejections (NFOA, pages 10-11). A proper determination of whether any experimentation is undue requires an analysis of the eight Wands factors (MPEP § 2164.01(a)). Applicant states that asserting claims 1 and 5 are "not explained in sufficient detail" because they restate functions of the detailed description (NFOA, pages 10-11) falls short of establishing a prima facie case for an enablement rejection under 112(a), and requests withdrawal of the 112(a) rejections for failing to allege the prima facie requirements of a nonenablement position.
Examiner disagrees with Applicant regarding withdrawal of the 112(a) rejections. Although Applicant states that support for "in response to a determination that the likelihood exceeds a predetermined threshold, […] instructing a first of the host devices to remain offline for a predetermined period of time […]" and "wherein the first host device has relatively limited processing resources available for recovering from the malicious cybersecurity event" is described in paragraph [0076] of the Specification, and further states at the bottom of page 4 of the remarks that support can be found in paragraphs [0068]-[0071], Applicant does not appear to identify which passages in particular overcome the 112(a) rejections for lack of written description, such as for a device "remain[ing] offline for a predetermined period of time". Instead, Applicant's arguments generally pertain to enablement ("any undue experimentation"), and paragraph [0076] of the filed Specification remains vague as to how this limitation is performed, stating only that it is applicable for devices that have relatively limited processing resources for recovering from a malicious cybersecurity event, have private user information stored thereon, etc. Examiner notes that under MPEP § 2161.01, "computer-implemented functional claim language must still be evaluated for sufficient disclosure under the written description and enablement requirements of 35 U.S.C. 112(a)", with paragraph I describing the written description requirement: "the specification must describe the claimed invention in sufficient detail that one skilled in the art can reasonably conclude that the inventor had possession of the claimed invention at the time of filing. Reiffin v. Microsoft Corp., 214 F.3d 1342, 1345, 54 USPQ2d 1915, 1917 (Fed. Cir. 2000) ("The purpose of [the written description requirement] is to ensure that the scope of the right to exclude, as set forth in the claims, does not overreach the scope of the inventor's contribution to the field of art as described in the patent specification"); LizardTech Inc. v. Earth Resource Mapping Inc., 424 F.3d 1336, 1345, 76 USPQ2d 1724, 1732 (Fed. Cir. 2005) ("Whether the flaw in the specification is regarded as a failure to demonstrate that the patentee [inventor] possessed the full scope of the invention recited in [the claim] or a failure to enable the full breadth of that claim, the specification provides inadequate support for the claim under [§ 112(a)]"); cf. id. ("A claim will not be invalidated on [§] 112 grounds simply because the embodiments of the specification do not contain examples explicitly covering the full scope of the claim language.")". Enablement is described in paragraph III: "Applicants who present broad claim language must ensure the claims are fully enabled. Specifically, the scope of the claims must be less than or equal to the scope of the enablement provided by the specification. Sitrick v. Dreamworks, LLC, 516 F.3d 993, 999, 85 USPQ2d 1826, 1830 (Fed. Cir. 2008) ("The scope of the claims must be less than or equal to the scope of the enablement to ensure that the public knowledge is enriched by the patent specification to a degree at least commensurate with the scope of the claims." (quotation omitted))", such that a person of ordinary skill would understand the claimed limitation can be performed in the context of the Specification.
Examiner states that the claimed limitations of "in response to a determination that the likelihood exceeds a predetermined threshold, […] instructing a first of the host devices to remain offline for a predetermined period of time […]" and "wherein the first host device has relatively limited processing resources available for recovering from the malicious cybersecurity event" can be understood by a person of ordinary skill as enabled to be performed so as to mitigate anticipated malicious cybersecurity events, rather than merely respond to the events after the malicious actors have gained unauthorized access to the host devices, as described in paragraph [0072]. However, the Specification does not describe in specific detail how the host remains disconnected: no particulars are provided, such as disconnection from a network, or otherwise how the operation of "remaining offline for a predetermined period of time" is performed after operation 208 in Fig. 2. The limitation of "wherein the first host device has relatively limited processing resources available for recovering from the malicious cybersecurity event" has been removed from the independent claims and replaced with "wherein the first host device has private user information stored thereon". While paragraph [0076] restates the claim limitation, and paragraph [0051] describes what malicious actors can do when they gain unauthorized access, such as mining private information stored on the devices, it is not clarified what "private user information" is intended to represent, for example, information not stored onto a cloud service, or merely sensitive information such as contacts, credit cards, and/or login information of a user. While a person of ordinary skill is capable of understanding these aspects of the claimed invention, the Specification does not describe in sufficient detail how to make and use the invention without undue experimentation: although these limitations are common techniques to mitigate malicious events, the Specification does not describe in specific detail how the invention performs these steps, as described above. As a result, the Examiner maintains the rejections under 112(a) for lack of written description for the claimed limitations of "in response to a determination that the likelihood exceeds a predetermined threshold, […] instructing a first of the host devices to remain offline for a predetermined period of time […]" and "wherein the first host device has relatively limited processing resources available for recovering from the malicious cybersecurity event", as recited with respect to independent claim 1, with the rejections of dependent claims 2-8 also maintained as they depend on claim 1.

In page 7 of the remarks, Applicant states that independent claims 9 and 17 were rejected under 112(a) for reasons similar to claim 1 as described above. Because the amendments mirror those now recited in claim 1, the Examiner maintains the 112(a) rejections of claims 9 and 17, and of dependent claims 10-16 and 18-20, which depend on claims 9 and 17, respectively.

In pages 8-9 of the remarks, Applicant states that claims 1, 4, 7, 9, 12, 15, 17, and 20 stand rejected under 35 U.S.C. § 102(a)(2) as being anticipated by Ackerman et al. (U.S. Pat. No. 12050715 B2), hereafter "Ackerman".
Independent claim 1 has been amended to include the previous claim limitations of claim 5, along with "[…] first model formed by low-dimensional causal event streams with predetermined time-bins and labels", "wherein the low-dimensional causal event streams include all recorded log events of the collected historical event log data, wherein the recorded log events are collected over a plurality of days", "in response to a determination that the likelihood exceeds a predetermined threshold and a determination that a first of the host devices has been previously accessed by unauthorized malicious cybersecurity actors, instructing a first of the host devices to remain offline for a predetermined period of time and instructing at least one of the host devices to implement additional authentication techniques", and "wherein the first host device has private user information stored thereon", with support found in paragraphs [0068]-[0071] and [0082] of the filed Specification. Applicant states that none of the art of record suggests the combination of features claimed. Applicant further states that Ackerman does not anticipate the training of the second model described in claim 1, such as the limitations previously included in claim 5 (NFOA, pages 37-38), and also fails to identify a final CLS token, now recited in claim 1. Applicant further states that the output of the class of the trees does not amount to an output vector from the CLS token after the token has passed through all layers of a transformer model. Applicant requests withdrawal of the rejection of independent claim 1 as anticipated by Ackerman.

Examiner states that Applicant does not argue in sufficient detail why the prior art of Ackerman does not recite the limitations of amended claim 1, including the limitations moved from claim 5 to claim 1; Applicant asserts only that the prior art does not recite the limitations. To reiterate the claimed limitations addressed in the 102(a)(2) rejections under Ackerman: [Col. 32, lines 38-44] describes a passage where the second model is capable of creating a statistical model that represents a time period of events that is then transformed into a vector for the statistical model; [Col. 26, lines 59-64] states that random forests construct a multitude of decision trees at training time, outputting the class of the trees, which corresponds to generating a final classification (CLS) token, as the class of the trees corresponds to a token stating the classification of the token embeddings of an event stream; and the class of the trees is also utilized to determine whether an event vector 1110 should be classified as within the baseline activity characterized by the entity model, as described in [Col. 34, lines 35-53] of Ackerman. Furthermore, the amended limitation of the first model being formed by low-dimensional causal event streams with predetermined time-bins and labels is further addressed by [Col. 31, lines 41-45], as events 1106 and event vectors 1110 can be labeled in several ways, including with process identifiers or with identification of an entity associated with the event 1106 or event vector 1110, which are defined as events that cause incidents to be logged in an event stream; and in [Col. 32, lines 38-44], each event is timestamped, which corresponds to predetermined time-bins for events in an event stream. Newly added limitations with support in the Specification have been addressed in the claim rejections under the 35 U.S.C. 102 section, in pages 15-25 of this Office Action.

In page 10 of the remarks, Applicant states that independent claims 9 and 17 were also rejected under 102(a)(2) for reasons similar to claim 1 as described above, as were claims 12 and 15 (depending from claim 9) and claim 20 (depending from claim 17). Because the amendments mirror those now recited in claim 1, the Examiner maintains the 102(a)(2) rejections of claims 9 and 17, and of dependent claims 12, 15, and 20.

In page 11 of the remarks, Applicant states that dependent claims 2-3, 10-11, and 18-19 stand rejected under 35 U.S.C. 103 as being unpatentable over Ackerman in view of Lee et al. (U.S. Pub. No. 20220094713 A1), hereinafter Lee. Applicant states that claims 2-3 depend upon claim 1 and that the rejections therefore suffer from the same deficiencies set forth for claim 1 above, as Lee has merely been added to show the limitations of the dependent claims, and believes claims 2-3 are allowable over the combination of Ackerman in view of Lee stated by the Examiner. Applicant also states, in page 11 of the remarks, that dependent claims 10-11 and 18-19 were rejected under 103 for reasons similar to claims 2-3, with claims 10-11 depending from claim 9 and claims 18-19 depending from claim 17. Because the arguments mirror those for claims 2-3, the Examiner maintains the 103 rejections of claims 10-11 and 18-19.

In page 12 of the remarks, Applicant states that dependent claims 5-6 and 13-14 stand rejected under 35 U.S.C. 103 as being unpatentable over Ackerman in view of Gopalakrishnan et al. (U.S. Pub. No. 20230134546 A1), hereinafter Gopalakrishnan. Applicant states that claims 5-6 depend upon claim 1 and that the rejections therefore suffer from the same deficiencies set forth for claim 1 above, as Gopalakrishnan has merely been added to show the limitations of the dependent claims, and believes claims 5-6 are allowable over the combination of Ackerman in view of Gopalakrishnan stated by the Examiner. Applicant also states, in page 12 of the remarks, that dependent claims 13-14 were rejected under 103 for reasons similar to claims 5-6, with claims 13-14 depending from claim 9. Because the arguments mirror those for claims 5-6, the Examiner maintains the 103 rejections of claims 13-14.

In page 13 of the remarks, Applicant states that dependent claims 8 and 16 stand rejected under 35 U.S.C. 103 as being unpatentable over Ackerman in view of Pratt et al. (U.S. Pat. No. 10673880 B2), hereinafter Pratt. Applicant states that claim 8 depends upon claim 1 and that the rejection therefore suffers from the same deficiencies set forth for claim 1 above, as Pratt has merely been added to show the limitations of the dependent claim, and believes claim 8 is allowable over the combination of Ackerman in view of Pratt stated by the Examiner. In page 12 of the remarks, Applicant states that dependent claim 16 was also rejected under 103 for reasons similar to claim 8, with claim 16 depending from claim 9.
Because the arguments mirror those for claim 8, the Examiner maintains the 103 rejection of claim 16, which depends from claim 9.

Claim Rejections - 35 USC § 112(a)

The following is a quotation of the first paragraph of 35 U.S.C. 112(a):

(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.

The following is a quotation of the first paragraph of pre-AIA 35 U.S.C. 112:

The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.

Claims 1-20 are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA), first paragraph, as failing to comply with the written description requirement. The claims contain subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA 35 U.S.C. 112, the inventor(s), at the time the application was filed, had possession of the claimed invention.

Claim 1 recites the limitations "in response to a determination that the likelihood exceeds a predetermined threshold and a determination that a first of the host devices has been previously accessed by unauthorized malicious cybersecurity actors, instructing a first of the host devices to remain offline for a predetermined period of time, and instructing at least one of the host devices to implement additional authentication techniques" and "wherein the first host device has private user information stored thereon", both described in the Specification of the Applicant in paragraph [0076]. However, no other information is provided as to how the limitations are performed. For instance, while paragraph [0051] describes what malicious actors can do when they gain unauthorized access, such as mining private information stored on the devices, it is not clarified what "private user information" is intended to represent, for example, information not stored onto a cloud service, or merely sensitive information such as contacts, credit cards, and/or login information of a user. The algorithm or steps/procedures for these claimed functions is not explained at all or is not explained in sufficient detail (simply restating the function recited in the claim is not necessarily sufficient) so that one of ordinary skill in the art would recognize that the applicant had possession of the claimed invention.
Claim 1 also contains the limitations of 'generating a time-series matrix that is based on the embedding vectors generated by the first model for a period of a recent logged event stream', 'performing a convolution and projection of the time-series matrix for the period of the recent logged event stream to generate time-token embeddings for the period of the recent logged event stream', 'applying the time-token embeddings for the period of the recent logged event stream to a hierarchical attention graph to generate a final classification (CLS) token embedding', and 'using the final CLS token embedding to train the second model', described in the Specification of the Applicant in paragraph [0082], with the claim limitations also repeated in Fig. 4C of the Specification. However, no other information is provided as to how the limitations are performed. The algorithm or steps/procedures for these claimed functions is not explained at all or is not explained in sufficient detail (simply restating the function recited in the claim is not necessarily sufficient) so that one of ordinary skill in the art would recognize that the applicant had possession of the claimed invention. Dependent claims 2-8 depend on independent claim 1; as a result of independent claim 1 being rejected under 112(a) for lacking written description, the dependent claims are also rejected under 112(a).

Claim 4 is rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA), first paragraph, as failing to comply with the written description requirement. The claim contains subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA 35 U.S.C. 112, the inventor(s), at the time the application was filed, had possession of the claimed invention. The claimed limitation of "causing the second model to estimate whether events associated with the training targets will occur within a third predetermined period of time" is amended in claim 4, with "third predetermined period of time" replacing the previous recitation of "second predetermined period of time". Applicant has not pointed out where the new (or amended) claim language is supported, nor does there appear to be a written description of the limitation "[…] third predetermined period of time" in the application as filed: while a "second predetermined period of time" is recited in paragraph [0071] of the Specification, the term "third predetermined period of time" is not recited anywhere in the Specification.

Claim 6 has the limitation of 'outputs of the second model as a result of the use of the final CLS token embedding to train the second model to classify input includes classifications of input as being malicious or normal', which is described in the Specification of the Applicant in paragraph [0082]. However, no other information is provided as to how the limitation is performed. The algorithm or steps/procedures for these claimed functions is not explained at all or is not explained in sufficient detail (simply restating the function recited in the claim is not necessarily sufficient) so that one of ordinary skill in the art would recognize that the applicant had possession of the claimed invention.
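For readers untangling the claim-1 pipeline quoted above (time-binned event embeddings, convolution and projection, hierarchical attention, final CLS token), the following is a minimal PyTorch sketch of one plausible reading of that claim language. Every module, kernel size, and dimension here is a hypothetical illustration supplied by the editor, not the application's disclosed algorithm; indeed, the rejection's point is that the Specification does not pin these details down.

import torch
import torch.nn as nn

def bin_events(event_vecs, timestamps, bin_seconds=3600.0):
    # Claim-8-style time-binning: sum the embedding vectors that fall within
    # each time slot, then stack the per-slot sums into a 2-D matrix
    # (slots x embed_dim). bin_seconds is a hypothetical slot width.
    slots = (timestamps / bin_seconds).floor().long()
    matrix = torch.zeros(int(slots.max()) + 1, event_vecs.size(1))
    matrix.index_add_(0, slots, event_vecs)
    return matrix

class HierarchicalTemporalEventTransformer(nn.Module):
    # Names and shapes are illustrative only.
    def __init__(self, embed_dim=128, num_heads=4):
        super().__init__()
        self.conv = nn.Conv1d(embed_dim, embed_dim, kernel_size=3, padding=1)
        self.proj = nn.Linear(embed_dim, embed_dim)
        self.attn = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)
        self.cls = nn.Parameter(torch.zeros(1, 1, embed_dim))  # learnable CLS token
        self.head = nn.Linear(embed_dim, 2)                    # malicious vs. normal

    def forward(self, time_series):                  # (batch, time_bins, embed_dim)
        # "performing a convolution and projection of the time-series matrix"
        x = self.conv(time_series.transpose(1, 2)).transpose(1, 2)
        x = self.proj(x)                             # time-token embeddings
        # prepend CLS and run one attention layer as a stand-in for the
        # claimed "hierarchical attention graph"
        x = torch.cat([self.cls.expand(x.size(0), -1, -1), x], dim=1)
        x, _ = self.attn(x, x, x)
        return self.head(x[:, 0])                    # classify from final CLS embedding

model = HierarchicalTemporalEventTransformer()
matrix = bin_events(torch.randn(1000, 128), torch.rand(1000) * 86400.0)
logits = model(matrix.unsqueeze(0))                  # (1, 2): malicious vs. normal

Even this toy version forces choices (slot width, kernel size, attention depth) that the claims recite only as functions, which is the gap the written description rejection is pointing at.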
Dependent claim 14 shares similar limitations with claim 6 above; as a result of claim 6 being rejected under 112(a) for lacking written description, dependent claim 14 is also rejected under 112(a). Claim 8 has the limitations of 'adding the determined embedding vectors that fall within a first of the time slots into a first sum embedding vector, wherein the first sum embedding vector is the only embedding vector of the first time slot', 'adding the determined embedding vectors that fall within a second of the time slots into a second sum embedding vector, wherein the second sum embedding vector is the only embedding vector of the second time slot', and 'stacking the sum embedding vectors into the two-dimensional matrix', which are described in the Specification of the Applicant in paragraph [0070]. However, no other information is provided as to how the limitations are performed. The algorithm or steps/procedures for these claimed functions is not explained at all or is not explained in sufficient detail (simply restating the function recited in the claim is not necessarily sufficient) so that one of ordinary skill in the art would recognize that the applicant had possession of the claimed invention. Dependent claim 16 shares similar limitations with claim 8 above; as a result of claim 8 being rejected under 112(a) for lacking written description, dependent claim 16 is also rejected under 112(a). Independent claims 9 and 17 share similar limitations with independent claim 1 above; as a result of independent claim 1 being rejected under 112(a) for lacking written description, independent claims 9 and 17 are also rejected under 112(a). Rejections under 112(a) are also made against dependent claims 10-16, which depend from claim 9, and claims 18-20, which depend from claim 17.

Claim Rejections - 35 USC § 102

The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:

A person shall be entitled to a patent unless – (a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1, 4, 7, 9, 12, 15, 17, and 20 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Ackerman et al. (US 12050715 B2), hereinafter Ackerman.

Regarding claim 1, Ackerman discloses "a computer-implemented method, comprising: collecting historical event log data from host devices" ([Col. 30, lines 27-31] "FIG. 11 shows a system for event monitoring and response. In general, the system may include a number of compute instances 1102 that use local security agents 1108 to gather events 1106 from sensors 1104 into event vectors 1110, and then report these event vectors 1110 to a threat management facility 1112", wherein the event data is collected from sensors of a compute instance or compute instances, corresponding to the collection of event log data from host devices to then convert into vectors.); "training a first model to convert textual log events of the historical event log data into event embedding vectors" ([Col. 30, lines 27-31] "FIG. 11 shows a system for event monitoring and response.
In general, the system may include a number of compute instances 1102 that use local security agents 1108 to gather events 1106 from sensors 1104 into event vectors 1110", where events themselves are converted into event vectors, with [Col. 30, line 67-Col. 31, line 1] stating that 'events 1106 may be tokenized', i.e., events can be assigned numbers or other identifiers that are then added to a vector. Per [Col. 32, lines 21-24] and as shown in Fig. 11, a vector can be any size and can encode any number of different events 1106 that may be tokenized, which allows the event vectors of Ackerman to correspond to the event embedding vectors of the Applicant, such that the event vectors can go into a threat management facility and be stored in an event stream that a second model for classifying behavior will utilize. Earlier in Ackerman, [Col. 18, line 55-Col. 19, line 5] also describes the security agent 306 applying machine learning models to detect threats in the logs of an endpoint 302, or other event types that may occur while running programs, and logging those events.); "training a second model to classify whether at least some of the event embedding vectors represent abnormal or potentially malicious behavior, wherein the second model is a hierarchical temporal event transformer model" ([Col. 30, lines 36-40] "The event stream 1114 may be analyzed with an analysis module 1118, which may in turn create entity models 1120 useful for detecting, e.g., unexpected variations in behavior of compute instances 1102… ", where the analysis module is capable of supplying the vectors and even preemptively analyzing some unusual behavior. [Col. 32, lines 38-44] "Each entity model 1120 may, for example, include a multi-dimensional description of events 1106 for an entity based on events 1106 occurring over time for that entity. This may be, e.g., a statistical model based on a history of events 1106 for the entity over time... entity models 1120 may, for example, be vector representations or the like of different events 1106 expected for or associated with an entity, and may also include information about the frequency, magnitude, or pattern of occurrence for each such event 1106", where the second model is capable of creating a statistical model that represents a time period of events and is then transformed into a vector for the statistical model, taking into account events that have occurred various times within the data the second model is trained on. The categorization of the frequency and patterns of the events is important when detecting anomalies or dangerous behaviors, e.g., events occurring more frequently than others, and therefore corresponds to the hierarchical temporal event transformer model.), "wherein the second model is trained in a first phase and a second phase" ([Col. 32, lines 33-37] The second model uses the vectors from the various entities to determine whether events are considered safe and/or malicious, or otherwise suspicious behavior, based on how the events play out. Creating the entity models and determining normal behavior constitutes the first phase; [Col. 33, lines 5-7] includes the training according to anomalies or dangerous behaviors for the second phase of training the second model.), "generating a time-series matrix that is based on the embedding vectors generated by the first model formed by low-dimensional causal event streams with predetermined time-bins and labels" ([Col. 32, lines 38-41] Fig. 11, each entity model 1120 includes a multi-dimensional description of events 1106 based on event stream 1114 occurring over time for that entity, corresponding to generating a time-series matrix based on embedding vectors generated by the first model for a period of a recent logged event stream, where multi-dimensional can also be read as 'low-dimensional'. [Col. 31, lines 41-45] Events 1106 and event vectors 1110 can be labeled in several ways, including with process identifiers or with identification of an entity associated with the event 1106 or event vector 1110, which are defined as events that cause incidents to be logged in an event stream. [Col. 32, lines 38-44] Entity model 1120 can be a statistical model based on a history of events 1106 of event stream 1114 for the entity over time, such as via a window or rolling average of events 1106, where each event in the rolling average is timestamped, corresponding to predetermined time-bins for events in an event stream, with the event streams being time stamped by threat management facility 1112 to record chronology, as stated in [Col. 32, lines 29-31].); "wherein the low-dimensional causal event streams include all recorded log events of the collected historical event log data" ([Col. 19, lines 36-41] When a connection has been established, events occurring over a time frame are converted into an event vector and detected by a detection engine, which can be done within a specified time frame from when it began.), "wherein the recorded log events are collected over a plurality of days" ([Col. 32, lines 58-67] As events are collected to form an event stream, the entity can be monitored for a week, as an example.); "performing a convolution and projection of the time-series matrix for the period of the recent logged event stream to generate time-token embeddings for the period of the recent logged event stream" ([Col. 32, lines 38-44] Entity model 1120 can be a statistical model based on a history of events 1106 of event stream 1114 for the entity over time, such as via a window or rolling average of events 1106, corresponding to a convolution of the time-series matrix, in conjunction with [Col. 32, lines 38-41] of Ackerman describing entity models including a multi-dimensional description of events 1106 occurring over time for the entity, corresponding to a projection of the time-series matrix for the period of the recent logged event stream, with the multi-dimensional description of events occurring over time generating time-token embeddings for the period of recent logged event streams when the time stamps in event vectors 1110 are taken into account, as stated in [Col. 32, lines 29-31].), "applying the time-token embeddings for the period of the recent logged event stream to a hierarchical attention graph to generate a final classification (CLS) token embedding" ([Col. 26, lines 59-64] Random forests are ensemble learning methods used for classification that construct a multitude of decision trees at training time, outputting the class of the trees; this corresponds to generating the final classification (CLS) token embedding of the Applicant. In the context of Ackerman, input to these random forests can include safe and unsafe samples of events, as stated in [Col. 26, lines 22-24]. Also, the multi-dimensional description of events occurring over time for the entity generates time-token embeddings for the period of recent logged event streams when the time stamps in event vectors 1110 are taken into account, as stated in [Col. 32, lines 29-31].), "and using the final CLS token embedding to pretrain the second model, wherein the pretraining comprises the second model classifying inputs as malicious or normal" ([Col. 34, lines 35-53] Entity model 1120 can determine whether an event vector 1110 should be classified as within the baseline activity characterized by the entity model, as a result of the output of the class of the trees, corresponding to using the final CLS token embedding to pretrain the second model.), "deploying the trained first model and the trained second model to predict a likelihood of a malicious cybersecurity event occurring within a first predetermined period of time that begins at the deployment of the trained models" ([Col. 34, lines 35-38] To deploy the first model and the second model in order to detect malicious or anomalous events, "the detection engine 1122 may compare new events 1106 generated by an entity, as recorded in the event stream 1114, to the entity model 1120 that characterizes a baseline of expected activity", which is able to obtain the event vectors from the data repository 1116, shown in Fig. 11. [Col. 34, lines 23-25] "Once an entity model 1120 has been created and a stable baseline established, the entity model 1120 may be deployed for use in monitoring prospective activity", in which the entity model being deployed corresponds to the trained second model being deployed to predict malicious events. In Fig. 3, showing enterprise network threat detection, various endpoint systems can be evaluated at one time, and the endpoint 302 connects to the threat management facility 308 (also shown in Fig. 11 as threat management facility 1112), which may use a filter to "manage a flow of information from the data recorder 304 to a remote resource such as the threat detection tools 314 of the threat management facility 308", as stated in [Col. 19, lines 19-22]. When a connection has been established, a request for specific events, or events occurring over a time frame, can monitor a data log, which is converted into an event vector and detected by a detection engine; this can be done within a specified time frame from when it began.), "wherein the likelihood is a classification output generated by the trained second model as a result of a two-dimensional matrix output by the trained first model being applied to the second model" ([Col. 34, lines 35-53] "The detection engine 1122 may compare new events 1106 generated by an entity, as recorded in the event stream 1114, to the entity model 1120 that characterizes a baseline of expected activity… comparison may use one or more vector distances such as a Euclidean distance, a Mahalanobis distance, a Minkowski distance, or any other suitable measurement of difference within the corresponding vector space. In another aspect, a k-nearest neighbor classifier may be used to calculate a distance between a point of interest and a training data set, or more generally to determine whether an event vector 1110 should be classified as within the baseline activity characterized by the entity model", where the comparison uses vector distances and operations to compare them, of which the Minkowski, Euclidean, and Mahalanobis distances are well known in the art to operate in any number of dimensions, including on two-dimensional matrices. [Col. 32, lines 38-41] Fig. 11, each entity model 1120 includes a multi-dimensional description of events 1106 based on event stream 1114, corresponding to a two-dimensional matrix output by the trained first model being applied to the trained second model. Furthermore, detection engine 1122 is applied to event stream 1114 to detect unusual or malicious activity based on entity models 1120, corresponding to the likelihood being a classification output generated by the trained second model of the Applicant.); "and in response to a determination that the likelihood exceeds a predetermined threshold and a determination that a first of the host devices has been previously accessed by unauthorized malicious cybersecurity actors, instructing a first of the host devices to remain offline for a predetermined period of time and instructing at least one of the host devices to implement additional authentication techniques" ([Col. 9, lines 60-63] Risk scores can be determined for users by identity management facility 172, and [Col. 34, lines 12-16] describes that different users may use the software differently, with behavior falling outside of the baseline corresponding to unauthorized malicious cybersecurity actors. In [Col. 35, lines 21-32] it is stated that when an event stream 1114 deviates from a baseline of expected activity described in the entity models 1120 for one or more entities, responses 1124 are initiated, including termination of network communication and quarantining the entity, corresponding to instructing a first of the host devices to remain offline for a predetermined period of time, in conjunction with filtering of particular endpoints that should not exceed a predetermined time, such as an hour, as stated in [Col. 37, lines 11-15]. The identity management facility 172 may determine a risk score for a user based on the events, with an identity provider also providing operations to confirm the identity of the user based on the events received, corresponding to implementing additional authentication techniques.), "wherein the first host device has private user information stored thereon" ([Col. 34, lines 64-67] Events that indicate malicious activity include transmitting sensitive information from an endpoint, where sensitive information corresponds to private user information stored on the host device.).
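The baseline comparison just cited (vector distances or a k-nearest-neighbor test against an entity model's history) is easy to picture concretely. A minimal sketch, assuming Euclidean distance and a purely invented threshold; none of the names or numbers below come from Ackerman or the application:

import torch

def knn_anomaly_score(event_vec, baseline_vecs, k=5):
    # Distance from a new event vector to every vector in the entity's
    # baseline history (Euclidean here; Mahalanobis and Minkowski are the
    # other distances the passage names).
    dists = torch.cdist(event_vec.unsqueeze(0), baseline_vecs).squeeze(0)
    # Mean distance to the k nearest baseline events; larger = more anomalous.
    return dists.topk(k, largest=False).values.mean()

baseline = torch.randn(500, 64)        # stand-in for the entity model's history
new_event = torch.randn(64)
score = knn_anomaly_score(new_event, baseline)
suspicious = score.item() > 12.0       # hypothetical predetermined threshold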
Regarding claim 4, Ackerman discloses the computer-implemented method of claim 1, wherein the first model is different than the second model, wherein training of the second model during the first phase includes: ([Col. 32, lines 33-37] "In general, an analysis module 1118 may analyze the event stream 1114 to identify patterns of events 1106 within the event stream 1114 useful for identifying unusual or suspicious behavior… may include creating entity models 1120 that characterize behavior of entities", where the second model uses the vectors from the various entities to determine whether events are considered safe and/or malicious, or otherwise suspicious behavior, based on how the events play out. Creating the entity models and determining normal behavior constitutes the first phase, and Ackerman [Col. 33, lines 5-7] states that "once an entity model is created, the entity model may usefully be updated, which may occur at any suitable intervals according to, e.g., the length of time to obtain a stable baseline... or any other factors", which includes the training according to anomalies or dangerous behaviors for the second phase of training the second model.): determining a subset of the event embedding vectors of the first model to use as training targets, and causing the second model to estimate whether events associated with the training targets will occur within a third predetermined period of time from the current time ([Col. 32, lines 45-57] "The entity models 1120 may, for example, be vector representations or the like of different events 1106 expected for or associated with an entity... entity model 1120 may be based on an entity type... which may have a related event schema that defines the types of events 1106 that are associated with that entity type. This may usefully provide a structural model for organizing events 1106 and characterizing an entity before any event vectors 1110 are collected, and/or for informing what events 1106 to monitor for or associate with a particular entity", where certain events selected by the analysis module 1118 in Fig. 11 are highlighted for their frequency, type of activity, or other metrics that indicate an event's behavior and whether it would cause issues with threats appearing, or whether it is considered safe behavior within the entity that the second model, or entity model, surveils. [Col. 32, line 58-Col. 33, line 2] "As an event stream 1114 is collected, a statistical model or the like may be developed for each event 1106 represented within the entity model so that a baseline of expected activity can be created... The entity model may also or instead be created by observing activity by the entity... monitoring the entity for an hour, for a day, for a week, or over any other time interval suitable for creating a model with a sufficient likelihood of representing ordinary behavior to be useful as a baseline as contemplated herein", where the events specified to be closely monitored or otherwise taken into account in Ackerman [Col. 32, lines 45-57] are monitored with a prediction of whether the events will occur within the time that the system is monitored, measured from the time the monitoring begins.).

Regarding claim 7, Ackerman discloses the computer-implemented method of claim 1, wherein the second model employs a neural-network architecture, wherein the first predetermined period of time ends an hour after the deployment of the trained models
([Col. 15, lines 54-56] "Classifiers may be used, such as neural network classifiers or other classifiers that may be trained by machine learning", in which the model used for training on the vectors and determining a potential outcome, or at least a training set, will use a neural network for its training. [Col. 34, lines 48-53] "a k-nearest neighbor classifier may be used to calculate a distance between a point of interest and a training data set, or more generally to determine whether an event vector 1110 should be classified as within the baseline activity characterized by the entity model", where the detection engine uses the vectors and the entity model to correlate behaviors exhibited by both and determine the state of the entity itself. [Col. 37, lines 11-15] Filtering of particular endpoints should not exceed a predetermined time after the entity model is deployed, such as an hour, corresponding to the first predetermined period of time ending an hour after deployment of the trained models of the Applicant.).

Regarding claim 9, Ackerman discloses a computer program product with limitations that are also present in the computer-implemented method of independent claim 1 as described above. Ackerman also discloses "a computer program product, the computer program product comprising a computer readable storage medium having program instructions embodied therewith, the program instructions readable and/or executable by a computer to cause the computer to: collect historical event log data from host devices" ([Col. 56, lines 63-67] "Embodiments disclosed herein may include computer program products comprising computer-executable code or computer-usable code that, when executing on one or more computing devices, performs any and/or all of the steps thereof. The code may be stored in a non-transitory fashion in a computer memory, which may be a memory from which the program executes (such as random-access memory associated with a processor), or a storage device such as a disk drive, flash memory or any other optical, electromagnetic, magnetic, infrared, or other device or combination of devices", with the products being housed inside non-transitory storage mediums such as disk drives and other forms of physical media, and the computer being capable of reading the code of the program product and executing it on the computing devices. [Col. 30, lines 27-31] "FIG. 11 shows a system for event monitoring and response. In general, the system may include a number of compute instances 1102 that use local security agents 1108 to gather events 1106 from sensors 1104 into event vectors 1110, and then report these event vectors 1110 to a threat management facility 1112", wherein the event data is collected from sensors of a compute instance or compute instances, corresponding to the collection of event log data from host devices to then convert into vectors.).

Regarding claim 12, Ackerman discloses the computer program product of claim 9 as described above. Ackerman also discloses the limitations of claim 4 above. Regarding claim 15, Ackerman discloses the computer program product of claim 9 as described above. Ackerman also discloses the limitations of claim 7 above.

Regarding claim 17, Ackerman discloses a system with limitations that are also present in the computer-implemented method of independent claim 1 as described above.
Ackerman also discloses "a system, comprising: a processor" ([Col. 56, lines 30-40] "The hardware may include a general-purpose computer and/or dedicated computing device. This includes realization in one or more microprocessors, microcontrollers... along with internal and/or external memory."); and logic integrated with the processor, executable by the processor, or integrated with and executable by the processor, the logic being configured to: collect historical event log data from host devices ([Col. 56, lines 30-40] "The hardware may include a general-purpose computer and/or dedicated computing device. This includes realization in one or more microprocessors, microcontrollers... along with internal and/or external memory. This may also, or instead, include one or more application specific integrated circuits, programmable gate arrays, programmable array logic components, or any other device or devices that may be configured to process electronic signals", with the logic expressed as application specific integrated circuits, programmable gate arrays, or other devices configured to process electronic signals. [Col. 30, lines 27-31] "FIG. 11 shows a system for event monitoring and response. In general, the system may include a number of compute instances 1102 that use local security agents 1108 to gather events 1106 from sensors 1104 into event vectors 1110, and then report these event vectors 1110 to a threat management facility 1112", wherein the event data is collected from sensors of a compute instance or compute instances, corresponding to the collection of event log data from host devices to then convert into vectors.).

Regarding claim 20, Ackerman discloses the system of claim 17 as described above. Ackerman also discloses the limitations of claim 4 above.

Claims 2-3, 10-11, and 18-19 are rejected under 35 U.S.C. 103 as being unpatentable over Ackerman in view of Lee et al. (US 20220094713 A1), hereinafter Lee.

Regarding claim 2, Ackerman discloses the computer-implemented method of claim 1, as described above. Ackerman does not appear to teach 'wherein the first model is trained to use natural language modeling to map tokens of the textual log events of the historical event log data to the event embedding vectors via a lookup table'. However, Lee teaches wherein the first model is trained to use natural language modeling to map tokens of the textual log events of the historical event log data to the event embedding vectors via a lookup table, wherein the trained models are deployed at the current time ([0051-0054] "FIG. 3 illustrates machine learning models. A first machine model 310 may be trained to perform a natural language task… A feature extractor may provide feature vectors representing natural language text to the embedding 312 of the first machine learning model 310", where the embedding of the first model assists in mapping the inputs into vector space, and the files themselves have timestamp information so as to keep track of the files, with paragraph [0032] stating that contextual information includes information about files and time stamps. Furthermore, the vectors are stored in memory, as paragraph [0048] states: "this feature vector 140 and/or the set of features may be stored in the memory 120. The feature extractor 112 also may determine contextual information for the file", such that the vector has information just as a file does, including a time stamp for the files it represents.). Accordingly, it would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention, having the teachings of Ackerman and Lee before them, to include Lee's 'wherein the first model is trained to use natural language modeling to map tokens of the textual log events of the historical event log data to the event embedding vectors via a lookup table' in Ackerman's 'computer-implemented method, comprising: collecting historical event log data from host devices'. One would have been motivated to make such a combination to increase efficiency, as a "first model 310 may preprocess raw text and convert the text as a sequence of word tokens. The NLP may use a pre-defined vocabulary for tokenization as in traditional models, and/or sub-word tokenizers such as those employed in BERT and GPT". The embedding layer in the first model takes the raw-text input and processes it into vector embedding space, as taught by Lee [0054-0055]. The resulting vectors can then be used in a machine learning model to analyze the information and determine whether the information present in the vector could be anomalous or dangerous to the device or a network, as stated in Lee [0039].

Regarding claim 3, Ackerman in view of Lee teaches the computer-implemented method of claims 1 and 2, as described above. Ackerman does not appear to teach, but Lee teaches, wherein the first model is a bidirectional encoder representations from transformers (BERT) model, wherein training the first model includes using masked-language modeling to learn to predict randomly masked words within a sentence that originates from the historical event log data, wherein the historical event log data is extracted from semi-structured event logs ([0053] "a BERT model may be trained to predict masked words in a sentence with large-scale datasets", where the BERT model is used as the first model to predict a masked word based on the context of a log or sentence, by taking into account words that appear before and after the masked word that needs to be filled in. The BERT model is trained on sentences to predict masked words using large-scale datasets, as stated in paragraphs [0052]-[0053]. [0033] Training data can include email messages that have been labeled as malicious or benign, and includes contextual information for the messages, such as time zone information, profile information, and timestamps, which can correspond to historical event log data as training data. Furthermore, paragraph [0037] features a feature extractor 112 configured to receive an analysis object, such as a file or a message, and paragraph [0014] further states that extracted words are used as training data for the models recited in paragraphs [0052]-[0053] of Lee. Finally, pre-training is performed with a large unlabeled dataset and fine-tuning with a small, labeled dataset, corresponding to the semi-structured event logs of the Applicant, as stated in paragraph [0053] of Lee.). Accordingly, it would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention, having the teachings of Ackerman and Lee before them, to include Lee's 'wherein the first model is a bidirectional encoder representations from transformers (BERT) model, wherein training the first model includes using masked-language modeling to learn to predict randomly masked words within a sentence that originates from the historical event log data, wherein the historical event log data is extracted from semi-structured event logs' in Ackerman's 'computer-implemented method, comprising: collecting historical event log data from host devices' combined with Lee's 'wherein the first model is trained to use natural language modeling to map tokens of the textual log events of the historical event log data to the event embedding vectors via a lookup table'. One would have been motivated to make such a combination to increase efficiency by utilizing a BERT model as the first machine learning model 310: in a "natural language processing model such as a Bidirectional Encoder Representations from Transformers (BERT) model, transformer layers can be replaced with simplified adapters without significant loss of predictive ability", which simplifies training and, therefore, the detection of anomalous or malicious activity, as taught by Lee [0004].
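Masked-language modeling, as the rejection summarizes it from Lee, is simple to state precisely: randomly hide a fraction of tokens and train the model to recover them from bidirectional context. A minimal sketch of the masking step alone, with a made-up mask rate and example log line; nothing here is drawn from Lee or the application:

import random

MASK, MASK_RATE = "[MASK]", 0.15

def mask_tokens(tokens, rng=random):
    # Hide roughly 15% of tokens; the training target at each masked
    # position is the original token (classic BERT-style masked-language
    # modeling, predicted from the words on both sides).
    inputs, targets = [], []
    for tok in tokens:
        if rng.random() < MASK_RATE:
            inputs.append(MASK)
            targets.append(tok)      # the model must predict this from context
        else:
            inputs.append(tok)
            targets.append(None)     # position not scored
    return inputs, targets

# e.g., a tokenized line from a hypothetical semi-structured event log
tokens = "sshd failed password for user root from 10.0.0.7".split()
inputs, targets = mask_tokens(tokens)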
Regarding claim 3, Ackerman in view of Lee teaches the computer-implemented method of claims 1 and 2, as described above. Ackerman does not appear to teach, but Lee teaches, wherein the first model is a bidirectional encoder representations from transformers (BERT) model, wherein training the first model includes using masked-language modeling to learn to predict randomly masked words within a sentence that originates from the historical event log data, wherein the historical event log data is extracted from semi-structured event logs ([0053] "a BERT model may be trained to predict masked words in a sentence with large-scale datasets", where the BERT model serves as the first model and predicts a masked word from the context of a log or sentence, taking into account the words that appear before and after the masked word to be filled in. The BERT model is trained to predict masked words in sentences from large-scale datasets, as stated in paragraphs [0052]-[0053]. Per paragraph [0033], training data can include email messages labeled as malicious or benign, together with contextual information for the messages such as time zone information, profile information, and timestamps, which corresponds to historical event log data used as training data. Furthermore, paragraph [0037] features a feature extractor 112 configured to receive an analysis object, such as one or more of a file or a message, and paragraph [0014] further states that extracted words are used as training data for the models recited in paragraphs [0052]-[0053] of Lee. Finally, pre-training uses a large unlabeled dataset and fine-tuning uses a small labeled dataset, corresponding to the Applicant's semi-structured event logs, as stated in paragraph [0053] of Lee.).

Accordingly, it would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention, having the teachings of Ackerman and Lee before them, to include Lee’s ‘wherein the first model is a bidirectional encoder representations from transformers (BERT) model, wherein training the first model includes using masked-language modeling to learn to predict randomly masked words within a sentence that originates from the historical event log data, wherein the historical event log data is extracted from semi-structured event logs’ in Ackerman’s system performing ‘computer-implemented method, comprising: collecting historical event log data from host devices’ and Lee’s ‘wherein the first model is trained to use natural language modeling to map tokens of the textual log events of the historical event log data to the event embedding vectors via a lookup table’. One would have been motivated to make such a combination to increase efficiency by utilizing a BERT model as the first machine learning model 310: in a "natural language processing model such as a Bidirectional Encoder Representations from Transformers (BERT) model, transformer layers can be replaced with simplified adapters without significant loss of predictive ability", which simplifies training and, therefore, the detection of anomalous or malicious activity, as taught by Lee [0004].
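The masking step at the heart of masked-language modeling is mechanical enough to sketch directly: a fraction of tokens is replaced with a mask symbol, and the original tokens become the prediction targets. The sketch below shows only this data-preparation step, with a hypothetical masking probability; it is not Lee's or BERT's actual training code.

import random

MASK, MASK_PROB = "[MASK]", 0.15  # 15% is the rate commonly used for BERT

def mask_sentence(tokens, seed=None):
    """Randomly mask tokens; return (masked input, prediction targets)."""
    rng = random.Random(seed)
    masked, targets = [], []
    for tok in tokens:
        if rng.random() < MASK_PROB:
            masked.append(MASK)
            targets.append(tok)    # the model learns to predict this token
        else:
            masked.append(tok)
            targets.append(None)   # no loss is computed at this position
    return masked, targets

tokens = "user admin opened port 8080 on host alpha".split()
print(mask_sentence(tokens, seed=7))

Because the model sees tokens on both sides of each mask, training this objective forces it to use bidirectional context, which is what distinguishes BERT-style encoders from left-to-right language models.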
Regarding claim 10, Ackerman in view of Gopalakrishnan teaches the computer program product of claim 9 as described above. Ackerman in view of Lee also teaches the limitations of claim 2 above. Regarding claim 11, Ackerman in view of Gopalakrishnan teaches the computer program product of claim 9 as described above. Ackerman in view of Lee also teaches the limitations of claim 3 above. Regarding claim 18, Ackerman in view of Gopalakrishnan teaches the system of claim 17 as described above. Ackerman in view of Lee also teaches the limitations of claim 2 above. Regarding claim 19, Ackerman in view of Gopalakrishnan teaches the system of claim 17 as described above. Ackerman in view of Lee also teaches the limitations of claim 3 above.

Claims 5-6 and 13-14 are rejected under 35 U.S.C. 103 as being unpatentable over Ackerman in view of Gopalakrishnan et al. (US 20230134546 A1), hereinafter Gopalakrishnan.

Regarding claim 5, Ackerman discloses the method of claims 1 and 4 above. Ackerman does not appear to disclose, but Gopalakrishnan teaches, “wherein the labels identify key detection events from a base cybersecurity application as labels for events of interest in training” ([Col. 21, lines 12-17] Fig. 4, the threat management system contains a coloring system 410 as a component to support threat detection, with the coloring system used to label or color software objects to improve tracking and detection of potentially harmful activity, such as labeling files, executables, processes, and so forth with suitable information. As a result, the coloring system is capable of identifying key detection events, labeling events that are potentially malicious, and further training the threat management system.).

Gopalakrishnan also teaches wherein training of the second model during the second phase includes: determining labeled examples, and causing the second model to classify whether the labeled examples represent abnormal or potentially malicious behavior ([0032] "the ML engine may use NLP to convert log text to numerical vectors and apply a trained classifier to the numerical vectors to generate a prediction", where the classifier classifies the data to generate the model’s prediction of the possibility of malware appearing. [0094] "FIG. 7 illustrates an example application of a model for analyzing a network threat associated with an event in accordance with some embodiments. Table 700 identifies a set of textual tokens and scores associated with an event log", where the vectors are classified as safe, anomalous, or dangerous based on the colors green, amber, or red, as stated in paragraphs [0094] and [0067]. Fig. 7 shows an example of how the second model’s classification works.).

Accordingly, it would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention, having the teachings of Ackerman and Gopalakrishnan before them, to include Gopalakrishnan’s ‘wherein training of the second model during the second phase includes: determining labeled examples, and causing the second model to classify whether the labeled examples represent abnormal or potentially malicious behavior’ in Ackerman’s ‘computer-implemented method, comprising: collecting historical event log data from host devices’. One would have been motivated to make such a combination to increase security: the vectors in Fig. 7 are given weighted scores reflecting the possibility that a malicious process will occur, and, using for instance the three trees shown in Fig. 7, the trees vote on whether an attack could happen, as taught in Gopalakrishnan [0094].

Regarding claim 6, Ackerman in view of Gopalakrishnan discloses the method of claims 1, 4, and 5 above. Ackerman also discloses that outputs of the second model, as a result of the use of the final CLS token embedding to train the second model to classify input, include classifications of input as being malicious or normal ([Col. 34, lines 35-53] Entity model 1120 can determine whether an event vector 1110 should be classified as within the baseline activity characterized by the entity model, a result output from the class voted by the trees, which corresponds to classifying input as malicious or normal as recited by the Applicant.). Ackerman does not appear to disclose, but Gopalakrishnan teaches, wherein a first of the labeled examples is based on an anomaly, wherein a second of the labeled examples is based on detected malicious activities ([0067] "For example, the trained classifier may assign a label of Green to activity not detected to be a network attack, Amber where a low risk of a network attack is predicted, and Red to activity with a high risk of network attack", wherein the label of Amber could designate an anomaly: the system indicates that something is wrong, but the risk is too low to signal an attack, which corresponds to the Applicant’s anomaly.).

Accordingly, it would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention, having the teachings of Ackerman and Gopalakrishnan before them, to include Gopalakrishnan’s ‘wherein a first of the labeled examples is based on an anomaly, wherein a second of the labeled examples is based on detected malicious activities’ in Ackerman’s ‘computer-implemented method, comprising: collecting historical event log data from host devices’. One would have been motivated to make such a combination to increase security: Fig. 6 details the process leading to the results displayed in Fig. 7, and an important aspect of labeling the state of a vector is the color-coding system, with green denoting safe, amber an anomaly, and red a dangerous vector. Those labels are then used to score and determine, through the voting in Fig. 7, whether an attack could happen, as taught in Gopalakrishnan [0091].
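As a sketch of what "determining labeled examples and causing the second model to classify" can look like in practice, the snippet below trains a small ensemble of voting trees on synthetic labeled vectors, loosely mirroring the green/amber/red coloring and the three-tree vote described above. The data, label scheme, and model choice are illustrative assumptions, not Gopalakrishnan's code.

import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)

# Hypothetical labeled examples: one embedding vector per event, with
# labels 0 = normal (green), 1 = anomaly (amber), 2 = malicious (red).
X = rng.normal(size=(300, 16))
y = rng.integers(0, 3, size=300)  # synthetic labels, for demonstration only

clf = RandomForestClassifier(n_estimators=3, random_state=0)  # three voting trees
clf.fit(X, y)

new_event_vector = rng.normal(size=(1, 16))
print(clf.predict(new_event_vector))  # e.g. [1] -> classified as anomaly-like

With real training data, the labels would come from the detection events and colorings discussed in the rejection rather than a random generator.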
Regarding claim 13, Ackerman in view of Gopalakrishnan teaches the computer program product of claim 9 as described above. Ackerman in view of Gopalakrishnan also teaches the limitations of claim 5 above. Regarding claim 14, Ackerman in view of Gopalakrishnan teaches the computer program product of claim 9 as described above. Ackerman in view of Gopalakrishnan also teaches the limitations of claim 6 above.

Claims 8 and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Ackerman in view of Pratt et al. (US 10673880 B1), hereinafter Pratt, and Sadaghiani et al. (US 20190034932 A1), hereinafter Sadaghiani.

Regarding claim 8, Ackerman discloses the method of claim 1, as outlined above, and further discloses determining, for the host devices, embedding vectors for a recent logged event stream ([Col. 34, lines 35-38] "The detection engine 1122 may compare new events 1106 generated by an entity, as recorded in the event stream 1114, to the entity model 1120 that characterizes a baseline of expected activity", where the event stream corresponds to the embedding vectors of a logged event stream; this part of the method emphasizes newer events so as to prioritize potential issues that the models may not have been trained on earlier.); and generating the two-dimensional matrix that is based on the determined embedding vectors for the recent logged event stream, wherein deployment of the trained second model includes: causing the two-dimensional matrix to be applied to the trained second model to generate a classification output that represents the likelihood ([Col. 30, lines 40-42] "A detection engine 1122 may be applied to the event stream 1114 in order to detect unusual or malicious activity, e.g. based on the entity models 1120 or any other techniques". [Col. 34, lines 35-53] "The detection engine 1122 may compare new events 1106 generated by an entity, as recorded in the event stream 1114, to the entity model 1120 that characterizes a baseline of expected activity… comparison may use one or more vector distances such as a Euclidean distance, a Mahalanobis distance, a Minkowski distance, or any other suitable measurement of difference within the corresponding vector space. In another aspect, a k-nearest neighbor classifier may be used to calculate a distance between a point of interest and a training data set, or more generally to determine whether an event vector 1110 should be classified as within the baseline activity characterized by the entity model", where the comparison uses vector distances, and the Euclidean, Mahalanobis, and Minkowski distances are well known in the art to operate in any number of dimensions, including on two-dimensional matrices. [Col. 32, lines 38-41] "Each entity model 1120 may, for example, include a multi-dimensional description of events 1106 for an entity based on events 1106 occurring over time for that entity", in which the multi-dimensional description of events may consist of the various vectors included in the entity model for training; by the nature of Euclidean and other vector distances, the detection engine’s comparison of the data would require the event vectors to be considered together as a multi-dimensional matrix, and as drawn in Fig. 11, the event stream appears to have two dimensions. [Col. 35, lines 21-24] "where the event stream 1114 deviates from a baseline of expected activity that is described in the entity models 1120 for one or more entities, any number of responses may be initiated by the response facility 1124... ", where responses can vary but would include further scans or alerts, as stated in Ackerman [Col. 35, lines 25-32].);
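The distance-based baseline comparison cited above is straightforward to sketch before continuing the claim mapping. The snippet below computes Minkowski distances (Euclidean when p = 2) between a new event vector and a synthetic baseline and flags deviation with a k-nearest-neighbor score; the data, the value of k, and the percentile cutoff are illustrative assumptions rather than Ackerman's implementation.

import numpy as np

rng = np.random.default_rng(2)
baseline = rng.normal(size=(200, 8))      # event vectors in the entity model
new_event = rng.normal(loc=2.0, size=8)   # a newly observed event vector

def minkowski(u, v, p=2):
    """Minkowski distance; p = 2 gives the Euclidean distance."""
    return float(np.sum(np.abs(u - v) ** p) ** (1.0 / p))

def knn_score(point, data, k=5, p=2):
    """Mean distance to the k nearest neighbors in the training data."""
    dists = sorted(minkowski(point, row, p) for row in data)
    return sum(dists[:k]) / k

score = knn_score(new_event, baseline)
# Reference distribution: kNN scores of baseline points against each other.
ref = [knn_score(b, np.delete(baseline, i, axis=0))
       for i, b in enumerate(baseline[:50])]
flag = score > np.percentile(ref, 95)     # illustrative 95th-percentile cutoff
print("deviates from baseline" if flag else "within baseline")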
Ackerman further discloses wherein the generating the two-dimensional matrix comprises: dividing, based on a predetermined time condition, a time period during which the recently logged event stream occurred into a plurality of time slots ([Col. 32, lines 9-11] Groups of events 1106 fall within a window of time that can be reported as an event vector 1110, with a window of time corresponding to a time slot within a predetermined time period of the Applicant.); adding the determined embedding vectors that fall within a first of the time slots into a first sum embedding vector, wherein the first sum embedding vector is the only embedding vector of the first time slot ([Col. 32, lines 29-31] Event vectors 1110 are time stamped to record chronology, with a first event vector in the window of time being separate from a second event vector, as stated in [Col. 32, lines 4-8]. The first event vector, time stamped and associated with the window of time, corresponds to the Applicant's first sum embedding vector as the only embedding vector of the first time slot.); adding the determined embedding vectors that fall within a second of the time slots into a second sum embedding vector, wherein the second sum embedding vector is the only embedding vector of the second time slot ([Col. 32, lines 4-8] A second event vector 1110 can be created and reported along with other temporally adjacent events 1106, while separate from a first event vector. The second event vector, time stamped and associated with the window of time, corresponds to the Applicant's second sum embedding vector as the only embedding vector of the second time slot.); and stacking the sum embedding vectors into the two-dimensional matrix ([Col. 32, lines 38-41] Fig. 11, each entity model 1120 includes a multi-dimensional description of events 1106 based on event stream 1114, corresponding to a two-dimensional matrix output by the trained first model and applied to the trained second model.).
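The time-slot construction mapped above reduces to a simple aggregation: divide the recent window into slots, sum the embedding vectors that fall in each slot, and stack the per-slot sums. A minimal sketch, assuming hypothetical slot and window sizes and a synthetic event stream:

import numpy as np

rng = np.random.default_rng(3)
EMBED_DIM, SLOT_SECONDS, WINDOW_SECONDS = 8, 60, 300  # hypothetical sizes

# Hypothetical recent logged event stream: (timestamp in seconds, embedding).
events = [(float(t), rng.normal(size=EMBED_DIM))
          for t in rng.integers(0, WINDOW_SECONDS, size=40)]

n_slots = WINDOW_SECONDS // SLOT_SECONDS
slot_sums = np.zeros((n_slots, EMBED_DIM))  # one sum embedding vector per slot

for ts, vec in events:
    slot = min(int(ts // SLOT_SECONDS), n_slots - 1)
    slot_sums[slot] += vec                   # add vectors falling in this slot

# Stacking the per-slot sums yields the two-dimensional matrix that would
# be applied to the trained second model.
print(slot_sums.shape)                       # (5, 8)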
Ackerman discloses the method of ‘wherein deployment of the trained first model includes: causing the trained first model to determine, for the host devices, embedding vectors for a recent logged event stream’. Ackerman does not explicitly teach the method of ‘wherein the classification output is a numerical score of a predetermined range of potential numerical scores’. However, Pratt teaches “wherein the classification output is a numerical score of a predetermined range of potential numerical scores, wherein bounds of the predetermined range are a score of one and a score of one hundred, wherein the score of one represents a relatively highest likelihood that the malicious cybersecurity event will occur within the first predetermined period of time” (Figs. 28 and 29 show two different outcomes depending on whether a threat indicator fires. In Fig. 28 in particular, both an anomaly rule and an anomaly model are at play: when an anomaly is detected by a rule, the model is trained to look out for anomalies similar to anomaly 1 found earlier, and when another anomaly is detected, it can raise a "threat indicator" alert as shown in the figure. Furthermore, as Fig. 30 relates to the elements of Figs. 28-29 and Fig. 18, the process can extend to other anomalies and even other models; according to [Col. 56, lines 40-46], when a threat indicator is found in an IT environment based on the events collected from other users or devices, "Process 3200 continues at step 3208 with generating a pattern matching score based on a result of the comparing. In some embodiments, the pattern matching score is a value in a set range. For example, the resulting pattern matching score may be a value between 0 and 10 with 0 being the least likely to be a threat and 10 being the most likely to be a threat".). Ackerman in view of Pratt does not appear to teach, but Sadaghiani teaches, the limitation of “wherein bounds of the predetermined range are a score of one and a score of one hundred, wherein the score of one represents a relatively highest likelihood” ([0046] The digital threat score ranges between zero and 100, where a higher global digital threat score value indicates a higher likelihood that the digital event involves digital fraud and/or abuse. Alternatively, a higher threat score value indicates a lower likelihood of digital fraud and/or abuse, which makes zero represent the highest likelihood of abuse and/or fraud; this corresponds to one representing a relatively highest likelihood, as the value of zero is closest to one.).
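The inverted 1-to-100 scale that the rejection assembles from Pratt's bounded score and Sadaghiani's zero-as-highest-likelihood reading can be illustrated with a small mapping from a model probability onto the claimed range. The function name, the clamping, and the rounding are assumptions for illustration only, not any reference's implementation.

def to_claimed_score(p_malicious: float) -> int:
    """Map a probability in [0, 1] onto a 1-100 scale where a score of 1
    represents the relatively highest likelihood of a malicious event."""
    p = min(max(p_malicious, 0.0), 1.0)   # clamp to the valid range
    return round(1 + (1.0 - p) * 99)

print(to_claimed_score(0.99))  # -> 2  (near-certain threat scores near 1)
print(to_claimed_score(0.05))  # -> 95 (likely-benign activity scores near 100)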
Accordingly, it would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention, having the teachings of Ackerman and Pratt before them, to include Pratt’s ‘wherein the classification output is a numerical score of a predetermined range of potential numerical scores’ in Ackerman’s system performing ‘computer-implemented method, comprising: collecting historical event log data from host devices’, ‘wherein deployment of the trained first model includes: causing the trained first model to determine, for the host devices, embedding vectors for a recent logged event stream’, and ‘generating a two-dimensional matrix that is based on the determined embedding vectors for the recent logged event stream, wherein deployment of the trained second model includes: causing the two-dimensional matrix to be applied to the trained second model to generate a classification output that represents the likelihood’. One would have been motivated to make such a combination to enhance security by presenting a score to the user: within "step 3208 with generating a pattern matching score based on a result of the comparing. In some embodiments, the pattern matching score is a value in a set range", process 3200 concludes at "step 3210 with identifying a security threat if the pattern matching score satisfies a specified criterion", and in this example in the prior art a score of 6 or greater indicates a threat being present in the entity, as stated in [Col. 56, lines 41-51]. This gives the user, the network, or the system an evaluation, lets the user know that an anomalous or malicious process is either running or stored, and indicates that remediation of the system may be required.

Regarding claim 16, Ackerman in view of Pratt and Sadaghiani teaches the computer program product of claim 9 as described above. Ackerman in view of Pratt and Sadaghiani also teaches the limitations of claim 8 above.

Conclusion

THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a). A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to TOMMY MARTINEZ, whose telephone number is (703) 756-5651. The examiner can normally be reached Monday through Friday, 8AM-4PM ET. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Jorge L. Ortiz-Criado, can be reached at (571) 272-7624, Monday through Friday, 7AM-7PM ET. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/T.M./
Examiner, Art Unit 2496

/JORGE L ORTIZ CRIADO/
Supervisory Patent Examiner, Art Unit 2496

Prosecution Timeline

May 12, 2023
Application Filed
Feb 03, 2025
Non-Final Rejection — §102, §103, §112
Apr 10, 2025
Response Filed
May 02, 2025
Final Rejection — §102, §103, §112
Jul 07, 2025
Response after Non-Final Action
Aug 08, 2025
Request for Continued Examination
Aug 14, 2025
Response after Non-Final Action
Sep 17, 2025
Non-Final Rejection — §102, §103, §112
Dec 18, 2025
Response Filed
Apr 03, 2026
Final Rejection — §102, §103, §112
Apr 14, 2026
Interview Requested


Prosecution Projections

5-6
Expected OA Rounds
0%
Grant Probability
0%
With Interview (+0.0%)
3y 1m
Median Time to Grant
High
PTA Risk
Based on 4 resolved cases by this examiner. Grant probability derived from career allow rate.
