DETAILED ACTION
This office action is in response to the amendments filed on 12/29/2025.
Claims 1-20 have been added.
Claims 4-6 are amended.
Claims 1-20 are presented for examination.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Arguments
Applicant's arguments filed 12/29/2025 with regard to the 35 USC 102 rejections and 35 USC 103 rejections (Remarks pp. 1-3) have been fully considered but they are not persuasive.
Applicant argues in essence:
[a] “Both of claims 1 and 9 recite converting the sequence of data values making up a packet into pixel attribute values. Limb fails to disclose converting the actual data values that make up a packet into pixel image attribute values. Instead, Limb's "payload data" is expressly limited to payload lengths. As the Office Action quotes: "The method 400 may include, at action 408, capturing target payload data and target time data from a target flow of network packets ... In some embodiments, the target payload data may indicate lengths of payloads of the network packets in the target flow." (Limb: col. 17, l. 62 - col. 18, l. 9). Limb likewise states: "The target payload data may indicate lengths of payloads of the network packets in the target flow." (Limb: col. 2, ll. 37-41) and "payraw = flow['payloadlengths']" (Limb: col. 8). Accordingly, Limb generates pixel values from packet payload length(s) (and time differences), not from the underlying packet data values that make up the packet.”
In response to [a], examiner respectfully disagrees. The claim limitations read “receiving a first packet of a first flow from the network communication system, the first packet comprising a first sequence of data values; converting the first sequence of data values to a first plurality of pixel image attribute values.” Under the broadest reasonable interpretation, these limitations require that the packet comprise a first sequence of data values and that those values be converted to pixel image attribute values. However, the claim gives no indication of how the conversion is performed (e.g., by a direct translation of bytes, as applicant appears to argue) or of what portion of the pixel image attribute values is “converted” from the first sequence of data values. The claim also does not exclude the use of secondary information in the conversion process.
Limb discloses at col. 17 line 62 to col. 18 line 18: “The method 400 may include, at action 408, capturing target payload data and target time data from a target flow of network packets between a target client application and a target server application. In some embodiments, the target payload data may indicate lengths of payloads of the network packets in the target flow. In some embodiments, the target time data may indicate time periods between arrivals of the network packets in the target flow. For example, the network analysis application 120 may capture, at action 408, target payload data (e.g., the payload data 114 n) and target time data (e.g., the time data 116 n) from a target flow of network packets (e.g., the flow of network packets 112 n) between a target client application (e.g., the client application 108 n) and a target server application (e.g., the server application 110 n).” It can be seen that, at the very least, the payload length data is obtained from the payload and, along with the time data, converted into the values for the pixels.
Therefore, Limb generates pixel values from the underlying packet data, i.e., the payload length, to obtain pixel image attribute values.
[b] “Further, because a sequence of data values from a packet is converted to a plurality of pixel attribute values, claims 1 and 9 recite that each packet ultimately corresponds to a plurality of pixel image attribute values. Limb does not disclose this subject matter. Limb extracts only a single payload-length value per packet (again, "payload lengths") and then derives the plurality of pixel values for the image from the plurality of packets making up the flow. (Limb: col. 3, ll. 25-31; col. 20, ll. 40-47). While Limb further discusses converting time data into pixel values, the time data is likewise derived from external factors and not the packet data values. (Limb: col. 5, ll. 40-45, "because payload data and time data for a flow of network packets is available even where the payloads ... are encrypted.") Limb fails to disclose generating multiple pixel attribute values from the data values of a packet.”
In response to [b], examiner respectfully disagrees. The claim limitations read “receiving a first packet of a first flow from the network communication system, the first packet comprising a first sequence of data values; converting the first sequence of data values to a first plurality of pixel image attribute values.” Under the broadest reasonable interpretation, these limitations require that the packet comprise a first sequence of data values and that those values be converted to pixel image attribute values. However, the claim gives no indication of how the conversion is performed (e.g., by a direct translation of bytes, as applicant appears to argue) or of what portion of the pixel image attribute values is “converted” from the first sequence of data values. The claim also does not exclude the use of secondary information, such as time data, in the conversion process.
As shown above in [a], at col. 17 line 62 to col. 18 line 18, the payload length data is obtained from the payload of each packet. The payload length, along with the time data, is then converted into the pixel attribute values in steps 202-212 of Fig. 2.
Therefore, examiner respectfully disagrees and maintains the 35 USC 102 rejection of the claims.
[c] “Additionally, as amended, claim 4 recites that an incoming packet is mapped to a first color channel and an outgoing packet is mapped to a second color. Limb, alone or in combination with Golic and Moussa, fails to teach or suggest this subject matter.
Limb only makes a brief, high-level mention that images "may instead be color images," without providing any details as to using distinct color channels to encode packet direction (incoming versus outgoing), or any rationale for doing so. The Office Action likewise cites Limb only for the general proposition that a "color image" may be generated, not for any teaching of channel-specific direction encoding.
Moussa does not cure this deficiency. As recognized in the portion quoted by the Examiner, Moussa teaches that "each pixel is represented by a combination of red (R), green (G) and blue (B) components such that each pixel has a resulting red/green/blue (RGB) color," and that "the particular values of R, G and B correspond to different profile attributes for a byte in the connection setup portion to which a pixel corresponds."
Accordingly, Moussa assigns three color components per pixel to encode multiple attributes for a byte, and does not discuss using distinct colors (or distinct channels) to indicate whether a packet is incoming or outgoing.
Indeed, modifying Moussa to use distinct channels solely to indicate packet direction would undermine Moussa's intended purpose, because the combination of R/G/B values being assigned to each pixel is integral to encoding the different profile attributes for the different bytes.”
In response to [c], examiner respectfully disagrees. Claim 4 recites in part “wherein: the first plurality of pixel image attribute values is a first plurality of one color channel of red-green-blue (RGB) values and corresponds to the incoming first packet; the second plurality of pixel image attribute values is a second plurality of one color channel of RGB values, different from the first color channel, and corresponds to the outgoing second packet.”
Under the broadest reasonable interpretation, this language only requires that particular channels “correspond” to the incoming packet or “correspond” to the outgoing packet; nothing in the claim links the channels to the actual incoming or outgoing direction of the packet. The claims broadly state that at least one channel is related to a first packet, which happens to be incoming, and that a second channel is associated with a second packet, which happens to be outgoing. In other words, an incoming packet's pixel attribute values have at least a first channel, and an outgoing packet's pixel attribute values have at least a second channel.
Therefore the claims do not require the packet direction to be indicated via the pixel attribute values, and Moussa is not relied upon to show this concept.
Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
Claim(s) 1, 2, and 9 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Limb (US 11,159,560 B1).
Regarding Claim 1, Limb discloses A method for detecting malicious activity in a network communication system (Limb: “In some embodiments, at least one of the training client applications and the training server applications is a malicious application. In these embodiments, the method may further include determining that the likelihood that the target client application and/or the target server application matches the malicious application is above a threshold match value, and in response, performing a remedial action.” Fig. 1. Detecting malicious application activity in a network communication system as in claim 1), the method comprising:
receiving a first packet of a first flow from the network communication system, the first packet comprising a first sequence of data values (Limb: Fig. 4 408, col. 17 line 62 to col. 18 line 9 “The method 400 may include, at action 408, capturing target payload data and target time data from a target flow of network packets between a target client application and a target server application. In some embodiments, the target payload data may indicate lengths of payloads of the network packets in the target flow. In some embodiments, the target time data may indicate time periods between arrivals of the network packets in the target flow. For example, the network analysis application 120 may capture, at action 408, target payload data (e.g., the payload data 114 n) and target time data (e.g., the time data 116 n) from a target flow of network packets (e.g., the flow of network packets 112 n) between a target client application (e.g., the client application 108 n) and a target server application (e.g., the server application 110 n).” Packets of a flow between client and server are obtained. The contents of the packets are the sequence of data values.);
converting the first sequence of data values to a first plurality of pixel image attribute values (Limb: Fig. 4 410, col. 3 lines 12-24 “the generating of the target image from the target payload data and the target time data, may include normalizing the payload data, normalizing the time data, combining the normalized payload data with the normalized time data into a set of combined data points, placing the set of combined data points in a matrix beginning at a center of the matrix and spiraling outward from the center of the matrix, and converting the matrix into the image by converting each data point in the matrix into a pixel of the image.” Col. 18 lines 10-18 “The method 400 may include, at action 410, generating a target image from the target payload data and the target time data. For example, the network analysis application 120 may generate, at action 410, a target image from the target payload data (e.g., the payload data 114 n) and the target time data (e.g., the time data 116 n). This generation of this target image may be performed according to one or more actions of the method 200.” The sequence of data from the packets is normalized and converted into pixels, each pixel having a color/value, as the resulting image is either grayscale (col. 3 lines 10-12) or color (col. 17 lines 5-10). The attributes of the pixels are the color/value of the pixel itself, as described in applicant's specification para.0009 “the first plurality of pixel image attribute values is a first plurality of red-green-blue (RGB) values” and claim 4.);
generating a first portion of an image based on the first plurality of pixel image attribute values (Limb: Fig. 4 410, col. 3 lines 12-24 “converting the matrix into the image by converting each data point in the matrix into a pixel of the image.”, Col. 18 lines 10-18 “The method 400 may include, at action 410, generating a target image from the target payload data and the target time data. For example, the network analysis application 120 may generate, at action 410, a target image from the target payload data (e.g., the payload data 114 n) and the target time data (e.g., the time data 116 n). This generation of this target image may be performed according to one or more actions of the method 200.” Col. 3 lines 10-12 grayscale, and col. 17 lines 5-10 color image. Based on the generated pixels, a color or grayscale image is generated); and
processing the image using a trained neural network model to determine a likelihood of malicious activity in the first flow (Limb: col. 18 lines 23-40 “The method 400 may include, at action 414, employing the trained convolutional neural network to determine an output including an extent to which the target image matches one of the training images in order to determine a likelihood that the target client application and/or the target server application matches one of the training client applications and/or one of the training server applications. … in order to determine a likelihood that the target client application (e.g., the client application 108 n) and/or the target server application (e.g., the server application 1110 n) matches one of the training client applications (e.g., the client applications 108 a-108 c) and/or one of the training server applications (e.g., the server applications 110 a-110 c). …For example, where at least one of the training client applications (e.g., client applications 108 a-108 c) and the training server applications (e.g., server application 110 a-110 c) is a known malicious application, the convolutional neural network 120 may have been trained to recognize the same or similar malicious application (e.g., a similar application may be slightly different, but a match above a threshold, such as 90%, may nevertheless identify the similar application as matching above a threshold, which may indicate that the malware is at least in the same malware family)” the image is processed to determine using a trained neural network, trained in step 406 Fig. 4, if the image matches a known malicious image).
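For illustration only, the image generation quoted above from Limb col. 3 lines 12-24 (normalize the payload data, normalize the time data, combine, place in a matrix spiraling outward from the center, convert each data point into a pixel) can be sketched as follows. This is a minimal sketch and not Limb's code: the min-max normalization, the interleaving used to combine the two series, the clockwise spiral direction, the 0-255 grayscale scaling, and the names normalize, spiral_coords, and flow_to_image are all assumptions made for the example.

```python
from typing import List, Tuple


def normalize(values: List[float]) -> List[float]:
    """Min-max scale values into [0, 1] (assumed scheme); constant input maps to zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def spiral_coords(size: int) -> List[Tuple[int, int]]:
    """Cells of a size x size matrix, starting at the center and spiraling outward."""
    row = col = size // 2
    coords = [(row, col)]
    d_row, d_col = 0, 1  # first leg moves right
    step = 1
    while len(coords) < size * size:
        for _ in range(2):  # two legs share each step length
            for _ in range(step):
                row, col = row + d_row, col + d_col
                if 0 <= row < size and 0 <= col < size:
                    coords.append((row, col))
            d_row, d_col = d_col, -d_row  # turn: right, down, left, up
        step += 1
    return coords


def flow_to_image(payload_lengths: List[float],
                  arrival_gaps: List[float],
                  size: int) -> List[List[int]]:
    """Build a grayscale matrix from per-packet payload lengths and arrival gaps."""
    norm_len = normalize(payload_lengths)
    norm_gap = normalize(arrival_gaps)
    # Assumption: combine by interleaving (length, gap) for each packet.
    points = [v for pair in zip(norm_len, norm_gap) for v in pair]
    image = [[0] * size for _ in range(size)]
    # Points beyond size * size cells are dropped; unused cells stay 0.
    for (row, col), value in zip(spiral_coords(size), points):
        image[row][col] = round(value * 255)
    return image


image = flow_to_image([100, 1500, 800], [0.0, 2.0, 1.0], size=3)
```

In this sketch each packet contributes two data points (one from its payload length, one from its arrival gap), and each data point becomes one pixel value.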
Regarding Claim 2, Limb discloses claim 1 as set forth above.
Limb further discloses generating a notification indicating that malicious activity has been detected (Limb: col. 18 lines 40-55 “the method 400 may further include determining that the likelihood that the target client application and/or the target server application matches the malicious application is above a threshold match value (e.g., above 90%), and in response, performing a remedial action. In these embodiments, the remedial action may include … alerting a user that the target client application and/or the target server application is likely a malicious application” upon detecting malicious activity, a user may be alerted.).
Regarding claim 9, the claim recites all of the same steps as claim 1, but in A system for detecting malicious activity in a network communication system (Limb: “In some embodiments, at least one of the training client applications and the training server applications is a malicious application. In these embodiments, the method may further include determining that the likelihood that the target client application and/or the target server application matches the malicious application is above a threshold match value, and in response, performing a remedial action.” Fig. 1. Detecting malicious application activity in a network communication system as in claim 1), the system comprising: one or more processors; a memory in communication with the processor and having instructions stored thereon that, when executed, cause the processor to (Limb: fig. 5, col.7 lines 10-22, col. 20 lines 13-22).
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claim(s) 3, 5, 6 is/are rejected under 35 U.S.C. 103 as being unpatentable over Limb (US 11,159,560 B1) in view of Golic et al. (hereinafter Golic, US 2010/0284283 A1).
Regarding Claim 3, Limb discloses claim 1 as set forth above.
Limb further discloses determining if there are additional packets in the first flow; receiving a second packet of the first flow from the network communication system, the second packet comprising a second sequence of data values (Limb: Fig. 4 408, col. 17 lines 62-col. 18 line 9 “The method 400 may include, at action 408, capturing target payload data and target time data from a target flow of network packets between a target client application and a target server application. In some embodiments, the target payload data may indicate lengths of payloads of the network packets in the target flow. In some embodiments, the target time data may indicate time periods between arrivals of the network packets in the target flow. For example, the network analysis application 120 may capture, at action 408, target payload data (e.g., the payload data 114 n) and target time data (e.g., the time data 116 n) from a target flow of network packets (e.g., the flow of network packets 112 n) between a target client application (e.g., the client application 108 n) and a target server application (e.g., the server application 110 n).” the process obtains every packet from the first flow, therefore determines if there are additional packets and obtains them from the network communication system, i.e. the system of client and server);
converting the second sequence of data values to a second plurality of pixel image attribute values (Limb: Fig. 4 410, col. 3 lines 12-24 “the generating of the target image from the target payload data and the target time data, may include normalizing the payload data, normalizing the time data, combining the normalized payload data with the normalized time data into a set of combined data points, placing the set of combined data points in a matrix beginning at a center of the matrix and spiraling outward from the center of the matrix, and converting the matrix into the image by converting each data point in the matrix into a pixel of the image.” Col. 18 lines 10-18 “The method 400 may include, at action 410, generating a target image from the target payload data and the target time data. For example, the network analysis application 120 may generate, at action 410, a target image from the target payload data (e.g., the payload data 114 n) and the target time data (e.g., the time data 116 n). This generation of this target image may be performed according to one or more actions of the method 200.” The sequence of data from the packets is normalized and converted into pixels, each pixel having a color/value, as the resulting image is either grayscale (col. 3 lines 10-12) or color (col. 17 lines 5-10). The attributes of the pixels are the color/value of the pixel itself, as described in applicant's specification para.0009 “the first plurality of pixel image attribute values is a first plurality of red-green-blue (RGB) values” and claim 4.);
generating a second portion of the image based on the second plurality of pixel image attribute values (Limb: Fig. 4 410, col. 3 lines 12-24 “converting the matrix into the image by converting each data point in the matrix into a pixel of the image.”, Col. 18 lines 10-18 “The method 400 may include, at action 410, generating a target image from the target payload data and the target time data. For example, the network analysis application 120 may generate, at action 410, a target image from the target payload data (e.g., the payload data 114 n) and the target time data (e.g., the time data 116 n). This generation of this target image may be performed according to one or more actions of the method 200.” Col. 3 lines 10-12 grayscale, and col. 17 lines 5-10 color image. Based on the generated pixels, a color or grayscale image is generated. The data points that correspond to this second packet becomes part of the image, the portion of the image corresponding to those data points is the second portion of the image.); and
processing the image using the trained neural network model to determine the likelihood of malicious activity in the first flow (Limb: col. 18 lines 23-40 “The method 400 may include, at action 414, employing the trained convolutional neural network to determine an output including an extent to which the target image matches one of the training images in order to determine a likelihood that the target client application and/or the target server application matches one of the training client applications and/or one of the training server applications. … in order to determine a likelihood that the target client application (e.g., the client application 108 n) and/or the target server application (e.g., the server application 1110 n) matches one of the training client applications (e.g., the client applications 108 a-108 c) and/or one of the training server applications (e.g., the server applications 110 a-110 c). …For example, where at least one of the training client applications (e.g., client applications 108 a-108 c) and the training server applications (e.g., server application 110 a-110 c) is a known malicious application, the convolutional neural network 120 may have been trained to recognize the same or similar malicious application (e.g., a similar application may be slightly different, but a match above a threshold, such as 90%, may nevertheless identify the similar application as matching above a threshold, which may indicate that the malware is at least in the same malware family)” the image is processed to determine using a trained neural network, trained in step 406 Fig. 4, if the image matches a known malicious image).
However Limb does not explicitly disclose processing the image using the trained neural network model to determine an updated likelihood of malicious activity in the first flow, as in a first set of packets are analyzed, and then a first set of packets plus the next packet are then analyzed.
Golic discloses processing the network data to determine an updated determination of malicious activity in the first flow (Golic: para.0035 “In particular, the anomalous traffic to be detected can be due to (D)DoS attacks, SPAM and/or SPIT attacks, scanning attacks, as well as malicious software attacks.” para.0092 “In a third embodiment of the detection method 200, a moving window of increasing length is defined. Such moving window extends from a chosen initial time up to the current time, and each time, the ending point of the moving window advances τ units of time, where τ determines the resolution in time for detecting the anomalous changes in traffic.” Para.0053 “The variation quantity Δ is compared, in a comparison step 206 (COMPARE), with comparison value such as a threshold value Thr. According to said comparison step 206, if the threshold value Thr is exceeded, then an anomaly is detected (branch Yes) and an alarm signal ALARM is generated in an alarm issuing step 207” para.0054 “Following a positive (Yes) or negative (No) anomaly detection, the detection method 200 can be repeated in connection with further packet flow portions.” Software attacks can be detected from network data in a sliding window method, and further packets are considered in combination with previous packets in order to determine malicious activity in the packets.).
Therefore it would have been obvious to one of ordinary skill in the art before the effective filing date to combine Limb with Golic in order to incorporate processing the network data to determine an updated determination of malicious activity in the first flow, such that the image based packet analysis in Limb is repeated using further packets to determine an updated likelihood of malicious activity.
One of ordinary skill in the art would have been motivated to combine because of the expected benefit of improved detection by using sliding windows (Golic: para.0021).
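For illustration only, the proposed combination (re-running an analysis each time a moving window of increasing length advances, then comparing the result with a threshold) can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from Golic or Limb: the window here advances by a packet count tau rather than by units of time, the score callable is a hypothetical stand-in for any flow analysis (such as an image-based classifier), and the names rescore_on_growing_window, tau, and threshold are chosen for the example.

```python
from typing import Callable, List, Sequence, Tuple


def rescore_on_growing_window(
    packets: Sequence[int],
    score: Callable[[Sequence[int]], float],
    tau: int,
    threshold: float,
) -> List[Tuple[int, float, bool]]:
    """Re-evaluate score on packets[0:end] each time end advances by tau packets.

    Returns (window_end, likelihood, alarm) for every evaluated window.
    """
    results = []
    end = tau
    while end <= len(packets):
        likelihood = score(packets[:end])  # earlier packets plus the new ones
        results.append((end, likelihood, likelihood > threshold))
        end += tau
    return results


def fraction_large(window: Sequence[int]) -> float:
    """Hypothetical stand-in score: fraction of payload lengths above 1000 bytes."""
    return sum(1 for length in window if length > 1000) / len(window)


history = rescore_on_growing_window(
    [100, 1500, 1500, 1500], fraction_large, tau=2, threshold=0.6
)
```

Each entry in history is an updated result computed over the previously analyzed packets together with the newly arrived ones.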
Regarding Claim 5, Limb-Golic discloses claim 3 as set forth above.
Limb further discloses wherein the first packet from the network communication system is an incoming packet (Limb: Fig. 4 408, col. 17 line 62 to col. 18 line 9 “The method 400 may include, at action 408, capturing target payload data and target time data from a target flow of network packets between a target client application and a target server application.” The packets are from the flow between client and server and include packets both to and from the server and client).
Regarding claim 6, Limb-Golic discloses claim 5 as set forth above.
Limb further discloses wherein the second packet from the network communication system is an outgoing packet (Limb: Fig. 4 408, col. 17 line 62 to col. 18 line 9 “The method 400 may include, at action 408, capturing target payload data and target time data from a target flow of network packets between a target client application and a target server application.” The packets are from the flow between client and server and include packets both to and from the server and client).
Claim(s) 4 is/are rejected under 35 U.S.C. 103 as being unpatentable over Limb (US 11,159,560 B1) in view of Golic et al. (hereinafter Golic, US 2010/0284283 A1) and further in view of El-Moussa et al. (hereinafter Moussa, US 2018/0115567 A1).
Regarding Claim 4, Limb-Golic teaches claim 6 as set forth above.
Limb further discloses wherein: the first plurality of pixel image attribute values is a first plurality of one color channel of values (Limb: Fig. 4 410, col. 3 lines 12-24 “converting the matrix into the image by converting each data point in the matrix into a pixel of the image.”, Col. 18 lines 10-18 “The method 400 may include, at action 410, generating a target image from the target payload data and the target time data. For example, the network analysis application 120 may generate, at action 410, a target image from the target payload data (e.g., the payload data 114 n) and the target time data (e.g., the time data 116 n). This generation of this target image may be performed according to one or more actions of the method 200.” Col. 3 lines 10-12 grayscale, and col. 17 lines 5-10 color image. Based on the generated pixels, a color or grayscale image is generated. see Fig. 3A-C, wherein each pixel has a different value);
the second plurality of pixel image attribute values is a second plurality of one color channel of values, different from the first color channel (Limb: Fig. 4 410, col. 3 lines 12-24 “converting the matrix into the image by converting each data point in the matrix into a pixel of the image.”, Col. 18 lines 10-18 “The method 400 may include, at action 410, generating a target image from the target payload data and the target time data. For example, the network analysis application 120 may generate, at action 410, a target image from the target payload data (e.g., the payload data 114 n) and the target time data (e.g., the time data 116 n). This generation of this target image may be performed according to one or more actions of the method 200.” Col. 3 lines 10-12 grayscale, and col. 17 lines 5-10 color image. Based on the generated pixels, a color or grayscale image is generated, see Fig. 3A-C, wherein each pixel has a different value);
However, Limb-Golic does not explicitly disclose wherein: the first plurality of pixel image attribute values is a first plurality of one color channel of red-green-blue (RGB) values and corresponds to the incoming first packet; the second plurality of pixel image attribute values is a second plurality of one color channel of RGB values, different from the first color channel, and corresponds to the outgoing second packet; and, for the incoming first packet, the first color channel is populated with the first plurality of pixel image attribute values and a second color channel is set to zero, and for the outgoing second packet, the first color channel is set to zero and the second color channel is populated with the second plurality of pixel image attribute values.
Moussa discloses wherein: the first plurality of pixel image attribute values is a first plurality of one color channel of red-green-blue (RGB) values and corresponds to the incoming first packet (Moussa: Fig. 19 a-b para.0120 “The profile image 1900 is a matrix such as a raster or bitmapped image in which each pixel is represented by a combination of red (R) 1902, green (G) 1904 and blue (B) 1908 components such that each pixel has a resulting red/green/blue (RGB) color, each color component having a range 0 to 255. The particular values of R, G and B correspond to different profile attributes for a byte in the connection setup portion to which a pixel corresponds.” The RGB channel values are set to particular values related to the bytes of each packet. Therefore the pixel image attribute values for that byte are for a particular packet, therefore corresponding to that packet. For example an incoming packet in Fig. 4);
the second plurality of pixel image attribute values is a second plurality of one color channel of RGB values, different from the first color channel, and corresponds to the outgoing second packet (Moussa: Fig. 19 a-b para.0120 “The profile image 1900 is a matrix such as a raster or bitmapped image in which each pixel is represented by a combination of red (R) 1902, green (G) 1904 and blue (B) 1908 components such that each pixel has a resulting red/green/blue (RGB) color, each color component having a range 0 to 255. The particular values of R, G and B correspond to different profile attributes for a byte in the connection setup portion to which a pixel corresponds.” The RGB channel values are set to particular values related to the bytes of each packet. Therefore the pixel image attribute values for that byte are for a particular packet, therefore corresponding to a second packet, for example a packet going in the other direction in Fig. 4.);
and, for the incoming first packet, the first color channel is populated with the first plurality of pixel image attribute values and a second color channel is set to zero (Moussa: Fig. 19 a-b para.0120 “The profile image 1900 is a matrix such as a raster or bitmapped image in which each pixel is represented by a combination of red (R) 1902, green (G) 1904 and blue (B) 1908 components such that each pixel has a resulting red/green/blue (RGB) color, each color component having a range 0 to 255. The particular values of R, G and B correspond to different profile attributes for a byte in the connection setup portion to which a pixel corresponds.” The RGB channel values are set to particular values related to the bytes of each packet, therefore the bytes of a flow may have no affinity towards a particular attribute and may be set to zero, whereas it may have affinity towards another attribute and comprise non zero values.), and
for the outgoing second packet, the first color channel is set to zero and the second color channel is populated with the second plurality of pixel image attribute values (Moussa: Fig. 19 a-b para.0120 “The profile image 1900 is a matrix such as a raster or bitmapped image in which each pixel is represented by a combination of red (R) 1902, green (G) 1904 and blue (B) 1908 components such that each pixel has a resulting red/green/blue (RGB) color, each color component having a range 0 to 255. The particular values of R, G and B correspond to different profile attributes for a byte in the connection setup portion to which a pixel corresponds.” Similarly, a second row of values may correspond to attributes the first row did not, but has no affinity towards attributes the first row has values towards, causing the opposite attributes to be set to zero.).
Therefore it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Limb-Golic with Moussa in order to incorporate wherein: the first plurality of pixel image attribute values is a first plurality of one color channel of red-green-blue (RGB) values and corresponds to the incoming first packet; the second plurality of pixel image attribute values is a second plurality of one color channel of RGB values, different from the first color channel, and corresponds to the outgoing second packet; and, for the incoming first packet, the first color channel is populated with the first plurality of pixel image attribute values and a second color channel is set to zero, and for the outgoing second packet, the first color channel is set to zero and the second color channel is populated with the second plurality of pixel image attribute values.
One of ordinary skill in the art would have been motivated to combine because of the expected benefit of improved data representation for detection of malware in traffic (Moussa: para.0120).
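For illustration only, the channel arrangement recited in the limitations above can be sketched as follows. This is a minimal sketch and is not taken from Limb, Golic, Moussa, or applicant's specification; the function name `packet_to_rgb_row` and the choice of red for incoming packets and green for outgoing packets are assumptions made solely for the example.

```python
def packet_to_rgb_row(payload: bytes, incoming: bool) -> list[tuple[int, int, int]]:
    """Map each payload byte (0-255) to one RGB pixel.

    Assumption for this sketch: incoming bytes populate the red channel and
    outgoing bytes populate the green channel; the other channels are zero.
    """
    if incoming:
        return [(value, 0, 0) for value in payload]
    return [(0, value, 0) for value in payload]


# Hypothetical three-byte payloads, one per direction.
row_in = packet_to_rgb_row(bytes([0x16, 0x03, 0x01]), incoming=True)
row_out = packet_to_rgb_row(bytes([0x16, 0x03, 0x03]), incoming=False)
```

Under these assumptions, every pixel of the incoming row has a zero green component and every pixel of the outgoing row has a zero red component.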
Claim(s) 7 is/are rejected under 35 U.S.C. 103 as being unpatentable over Limb (US 11,159,560 B1) in view of Nagai et al. (hereinafter Nagai, US 2023/0412624 A1).
Regarding claim 7, Limb discloses claim 1 as set forth above.
However Limb does not explicitly disclose wherein the first sequence of data values comprises at least one hexadecimal byte value.
Nagai discloses wherein the first sequence of data values comprises at least one hexadecimal byte value (Nagai: para.0022 “For example, the feature quantity generation unit 122 regards the payload of each packet as a hexadecimal number byte string and transforms each byte into a decimal number to generate the feature quantity.” The packet data is in hexadecimal format.).
Therefore it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Limb with Nagai in order to incorporate wherein the first sequence of data values comprises at least one hexadecimal byte value.
One of ordinary skill in the art would have been motivated to combine because of the expected benefit of known efficiency of data representation using hexadecimal notation (Nagai: para.0022).
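For illustration only, the transformation described in Nagai para.0022 (regarding a payload as a hexadecimal byte string and transforming each byte into a decimal number) can be sketched as follows. The function name `hex_payload_to_decimals` and the sample payload are assumptions for the example and do not appear in Nagai.

```python
def hex_payload_to_decimals(hex_payload: str) -> list[int]:
    """Interpret a hexadecimal byte string and return one decimal value per byte."""
    # bytes.fromhex ignores ASCII whitespace between byte pairs.
    raw = bytes.fromhex(hex_payload)
    return list(raw)


# Hypothetical payload of four bytes written in hexadecimal notation.
decimals = hex_payload_to_decimals("16 03 01 ff")
```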
Claim(s) 8 is/are rejected under 35 U.S.C. 103 as being unpatentable over Limb (US 11,159,560 B1) in view of Nagai et al. (hereinafter Nagai, US 2023/0412624 A1) in view of El-Moussa et al. (hereinafter Moussa, US 2018/0115567 A1).
Regarding Claim 8, Limb-Nagai discloses claim 7 as set forth above.
Limb further discloses converting the first sequence of data values to the first plurality of pixel image attribute values (Limb: Fig. 4 410, col. 3 lines 12-24 “the generating of the target image from the target payload data and the target time data, may include normalizing the payload data, normalizing the time data, combining the normalized payload data with the normalized time data into a set of combined data points, placing the set of combined data points in a matrix beginning at a center of the matrix and spiraling outward from the center of the matrix, and converting the matrix into the image by converting each data point in the matrix into a pixel of the image.” Col. 18 lines 10-18 “The method 400 may include, at action 410, generating a target image from the target payload data and the target time data. For example, the network analysis application 120 may generate, at action 410, a target image from the target payload data (e.g., the payload data 114 n) and the target time data (e.g., the time data 116 n). This generation of this target image may be performed according to one or more actions of the method 200.” The sequence of data from the packets are normalized and converted into pixels, each pixel having a color/value as the resulting image is either color or grayscale, col. 3 lines 10-12 grayscale, and col. 17 lines 5-10 color image. The attributes of the pixels are the color/value of the pixel itself, as described in applicants specification para.0009 “the first plurality of pixel image attribute values is a first plurality of red-green-blue (RGB) values” and claim 4.)
However Limb does not explicitly disclose wherein converting the first sequence of data values to the first plurality of pixel image attribute values comprises: converting the at least one hexadecimal byte value to at least one decimal value; and assigning a corresponding color scale value to the at least one decimal value.
Nagai discloses converting the at least one hexadecimal byte value to at least one decimal value (Nagai: para.0022 “For example, the feature quantity generation unit 122 regards the payload of each packet as a hexadecimal number byte string and transforms each byte into a decimal number to generate the feature quantity.” The packet data is in hexadecimal format.).
Therefore it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Limb with Nagai in order to incorporate wherein the first sequence of data values comprises at least one hexadecimal byte value.
One of ordinary skill in the art would have been motivated to combine because of the expected benefit of known efficiency of data representation using hexadecimal notation (Nagai: para.0022).
However, while Limb discloses the concept of converting decimal values to pixels, such as in col. 8 lines 1-20, col. 12 lines 25-27, and col. 17 lines 5-10 (color images), Limb-Nagai does not explicitly disclose assigning a corresponding color scale value to the at least one decimal value.
Moussa discloses assigning a corresponding color scale value to the at least one decimal value (Moussa: para.0120 “The profile image 1900 is a matrix such as a raster or bitmapped image in which each pixel is represented by a combination of red (R) 1902, green (G) 1904 and blue (B) 1908 components such that each pixel has a resulting red/green/blue (RGB) color, each color component having a range 0 to 255. The particular values of R, G and B correspond to different profile attributes for a byte in the connection setup portion to which a pixel corresponds…. For example, a first byte might typically exhibit normalized coefficient/entropy values about a mean value of 100 with a maximum deviation of 10. Such a byte can be profiled by simply identifying the mean and deviation values. A profile employing a mean and deviation value where a pixel R value corresponds to the mean and a pixel G value corresponds to the deviation can be identified as profile 1. The profile identifier can itself be encoded in the pixel B value so that, at runtime, the meaning of the R and G values can be determined. Other alternative profile types can therefore be employed within the same profile matrix (image), distinguished by the pixel B value itself.” Based on the values of each byte, a particular value for each RGB value can be set.).
Therefore it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Limb-Nagai with Moussa in order to incorporate assigning a corresponding color scale value to the at least one decimal value, and apply this concept to the step of Limb of converting the first sequence of data values to the first plurality of pixel image attribute values.
One of ordinary skill in the art would have been motivated to combine because of the expected benefit of improved data representation for detection of malware in traffic (Moussa: para.0120).
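For illustration only, the example given in Moussa para.0120 (a pixel R value corresponding to a mean, a pixel G value corresponding to a deviation, and a pixel B value encoding the profile identifier, each in the range 0 to 255) can be sketched as follows. The function names and the clamping of out-of-range inputs are assumptions for the example and are not described in Moussa.

```python
def clamp_channel(value: float) -> int:
    """Round a value and limit it to the 0-255 range of one color component."""
    return max(0, min(255, round(value)))


def encode_profile_pixel(mean: float, deviation: float, profile_id: int) -> tuple[int, int, int]:
    """Return an (R, G, B) pixel with R=mean, G=deviation, B=profile identifier."""
    return (clamp_channel(mean), clamp_channel(deviation), clamp_channel(profile_id))


# Values taken from the quoted passage: mean 100, maximum deviation 10, profile 1.
pixel = encode_profile_pixel(100, 10, 1)
```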
Claim(s) 10-15 is/are rejected under 35 U.S.C. 103 as being unpatentable over Limb (US 11,159,560 B1) in view of Lim et al. (hereinafter Lim, US 2022/0174083 A1) in view of Nakamura et al. (hereinafter Nakamura, WO 2019240054 A1).
Regarding Claim 10, Limb discloses A method for detecting malicious activity in a network communication system (Limb: “In some embodiments, at least one of the training client applications and the training server applications is a malicious application. In these embodiments, the method may further include determining that the likelihood that the target client application and/or the target server application matches the malicious application is above a threshold match value, and in response, performing a remedial action.” Fig. 1. Detecting malicious application activity in a network communication system as in claim 1), the method comprising:
receiving, from a bi-directional packet flow, an incoming packet comprising a first set of bytes and an outgoing packet comprising a second set of bytes (Limb: Fig. 4 408, col. 17 line 62-col. 18 line 9 “The method 400 may include, at action 408, capturing target payload data and target time data from a target flow of network packets between a target client application and a target server application. In some embodiments, the target payload data may indicate lengths of payloads of the network packets in the target flow. In some embodiments, the target time data may indicate time periods between arrivals of the network packets in the target flow. For example, the network analysis application 120 may capture, at action 408, target payload data (e.g., the payload data 114 n) and target time data (e.g., the time data 116 n) from a target flow of network packets (e.g., the flow of network packets 112 n) between a target client application (e.g., the client application 108 n) and a target server application (e.g., the server application 110 n).” Packets of a flow between client and server are obtained. The contents of the packets are the first set of bytes. As seen in Fig. 1, the packets are from the flows of packets 112a-n and are bidirectional, therefore both incoming and outgoing packets are obtained.);
generating a first one-dimensional pixel sequence (Limb: Col. 7 line 47-67 “The method 200 may include, at action 214, combining the normalized payload data with the normalized time data into a set of combined data points. In some embodiments, the combining of the normalized payload data with the normalized time data into the set of combined data points may include interleaving the normalized payload data and the normalized time data into an array of the set of combined data points.” Seen in Fig. 2 step 214, wherein the payload data and time data are combined. Each pair of payload and time data is a one dimensional pixel sequence for a packet, a pair for one of the incoming data packets is a first pixel sequence.);
generating a second one-dimensional pixel sequence (Limb: Col. 7 line 47-67 “The method 200 may include, at action 214, combining the normalized payload data with the normalized time data into a set of combined data points. In some embodiments, the combining of the normalized payload data with the normalized time data into the set of combined data points may include interleaving the normalized payload data and the normalized time data into an array of the set of combined data points.” Seen in Fig. 2 step 214, wherein the payload data and time data are combined. Each pair of payload and time data is a one dimensional pixel sequence for a packet. A pair for one of the outgoing packets is a second sequence.);
generating an image having a first dimension of N pixels, the image comprising the first one-dimensional pixel sequence in a first color channel in a first line of the image (Limb: Fig. 4 410, col. 3 lines 12-24 “converting the matrix into the image by converting each data point in the matrix into a pixel of the image.”, Col. 18 lines 10-18 “The method 400 may include, at action 410, generating a target image from the target payload data and the target time data. For example, the network analysis application 120 may generate, at action 410, a target image from the target payload data (e.g., the payload data 114 n) and the target time data (e.g., the time data 116 n). This generation of this target image may be performed according to one or more actions of the method 200.” Col. 7 lines 54-66 “Then, the method 200 may include, at action 216, placing the set of combined data points in a matrix beginning at a center of the matrix and spiraling outward from the center of the matrix. In some embodiments, the placing of the set of combined data points in the matrix may include placing the set of combined data points in the matrix beginning at the center of the matrix and spiraling outward in a clockwise direction from the center of the matrix.” Based on the generated pixels, a color or grayscale image is generated. Each data point in the matrix is laid out into an image such as in Fig. 3A, which has a dimension N, i.e. the width of the square. As each data point corresponds to a pixel in that image, and the data points are laid out in a spiral pattern starting from the middle, a pair of pixels in the same row may be the first one-dimensional pixel sequence. col. 17 lines 5-10 “Although the images 300 a-300 c are illustrated in FIGS. 3A-3C as lossless grayscale images, it is understood that the images 300 a-300 c may instead be color images” The images may be color images, therefore the pair of pixels may be represented by color channels.) and
the second one-dimensional pixel sequence in a second color channel in a distinct second line of the image (Limb: Col. 7 lines 54-66 “Then, the method 200 may include, at action 216, placing the set of combined data points in a matrix beginning at a center of the matrix and spiraling outward from the center of the matrix. In some embodiments, the placing of the set of combined data points in the matrix may include placing the set of combined data points in the matrix beginning at the center of the matrix and spiraling outward in a clockwise direction from the center of the matrix.” Each data point in the matrix is laid out into an image such as in Fig. 3A, which has a dimension N, i.e. the width of the square. As each data point corresponds to a pixel in that image, and the data points are laid out in a spiral pattern starting from the middle, a pair of pixels in another row may be the second one-dimensional pixel sequence. col. 17 lines 5-10 “Although the images 300 a-300 c are illustrated in FIGS. 3A-3C as lossless grayscale images, it is understood that the images 300 a-300 c may instead be color images” The images may be color images, therefore the pair of pixels may be represented by different color channels, as the colors would represent the data values.); and
processing the image using a trained convolutional-neural-network (CNN) classifier to determine a likelihood of malicious activity in the network communication system (Limb: col. 18 lines 23-40 “The method 400 may include, at action 414, employing the trained convolutional neural network to determine an output including an extent to which the target image matches one of the training images in order to determine a likelihood that the target client application and/or the target server application matches one of the training client applications and/or one of the training server applications. … in order to determine a likelihood that the target client application (e.g., the client application 108 n) and/or the target server application (e.g., the server application 110 n) matches one of the training client applications (e.g., the client applications 108 a-108 c) and/or one of the training server applications (e.g., the server applications 110 a-110 c). …For example, where at least one of the training client applications (e.g., client applications 108 a-108 c) and the training server applications (e.g., server application 110 a-110 c) is a known malicious application, the convolutional neural network 120 may have been trained to recognize the same or similar malicious application (e.g., a similar application may be slightly different, but a match above a threshold, such as 90%, may nevertheless identify the similar application as matching above a threshold, which may indicate that the malware is at least in the same malware family)” The image is processed using a neural network, trained in step 406 of Fig. 4, to determine whether the image matches a known malicious image).
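For illustration only, the placement quoted above from Limb col. 7 lines 54-66 (placing data points in a matrix beginning at the center and spiraling outward in a clockwise direction) can be sketched as follows. This is a minimal sketch and not Limb's code; the function name `spiral_place`, the restriction to an odd matrix size, the initial step to the right, and the zero fill value are assumptions for the example.

```python
def spiral_place(values, size):
    """Place values into a size x size matrix from the center outward, clockwise.

    Assumptions for this sketch: size is odd, the first move is to the right,
    unused cells stay 0, and values beyond size * size are ignored.
    """
    if size % 2 == 0:
        raise ValueError("this sketch only handles odd matrix sizes")
    grid = [[0] * size for _ in range(size)]
    pending = list(values)[: size * size]
    if not pending:
        return grid
    row = col = size // 2
    grid[row][col] = pending[0]
    placed = 1
    moves = [(0, 1), (1, 0), (0, -1), (-1, 0)]  # right, down, left, up
    direction = 0
    step = 1
    while placed < len(pending):
        for _ in range(2):  # each leg length is used for two consecutive legs
            d_row, d_col = moves[direction % 4]
            for _ in range(step):
                if placed == len(pending):
                    return grid
                row += d_row
                col += d_col
                grid[row][col] = pending[placed]
                placed += 1
            direction += 1
        step += 1
    return grid


# Hypothetical data points 1..9 placed in a 3 x 3 matrix.
matrix = spiral_place(range(1, 10), 3)
```

With the assumed conventions, the first data point lands in the center cell and later data points wrap around it, so consecutive data points are not necessarily adjacent within one row.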
However Limb does not explicitly disclose processing the first set of bytes to obtain a first sequence of bytes from the incoming packet; processing the second set of bytes to obtain a second sequence of bytes from the outgoing packet; generating a first one-dimensional pixel sequence having a pixel count N and corresponding to the first sequence of bytes; generating a second one-dimensional pixel sequence having the pixel count N and corresponding to the second sequence of bytes; determining color channels for the first one-dimensional pixel sequence and the second one-dimensional pixel sequence based on incoming or outgoing packet status.
Lim discloses processing the first set of bytes to obtain a first sequence of bytes from the incoming packet (Lim: para.0108-0113 “In a record parsing operation 307, payloads may be analyzed to extract fragmented records from the payloads. Since records are spread over TCP segments, it may be determined whether a combined payload is a series of record chunks in a heuristic way, and each record may be separated according to the following rules…. 4) A record header may represent that a 5-byte region contains the values described in 1) to 3). When a record length is 1, the parsing module may separate a following 1+5 byte region from the payload.” The payload bytes of a packet are processed to obtain a first sequence of bytes. This may correspond to record r1 in para.0118 and Fig. 4.);
processing the second set of bytes to obtain a second sequence of bytes from the outgoing packet (Lim: para.0108-0113 “In a record parsing operation 307, payloads may be analyzed to extract fragmented records from the payloads. Since records are spread over TCP segments, it may be determined whether a combined payload is a series of record chunks in a heuristic way, and each record may be separated according to the following rules…. 4) A record header may represent that a 5-byte region contains the values described in 1) to 3). When a record length is 1, the parsing module may separate a following 1+5 byte region from the payload.” The payload bytes of a packet are processed to obtain a second sequence of bytes. Para.0116 “Series of records received by the client terminal 201 and the server 203 may be aggregated separately. Meanwhile, the order of parsed records may be chronologically retained to merge the two independent record sequences into a single sorted sequence.” As seen in Fig. 2, the packets are between client and server, and while incoming packets (e.g. received by the client) and outgoing packets (e.g. received by the server) are analyzed separately, they are ultimately combined. The second record may be any of r2-n, para.0118, Fig. 4.);
generating a first one-dimensional pixel sequence having a pixel count N and corresponding to the first sequence of bytes (Lim: Fig. 4, para.0118 “In a pre-processing operation 309, first n records r1, r2, . . . , rn obtained from the given record sequence may be arranged, and each of the n records r1, r2, . . . , rn may be trimmed to b bytes.” Para.0128 “Each 1D auto-encoder includes four encoding layers and four decoding layers, and the encoding layers may generate a series of m-dimension reduction vectors (E1, E2, . . . En∈Rm).” Each record is converted into a 1xm pixel sequence E1-n with pixel count m);
generating a second one-dimensional pixel sequence having the pixel count N and corresponding to the second sequence of bytes (Lim: Fig. 4, para.0118 “In a pre-processing operation 309, first n records r1, r2, . . . , rn obtained from the given record sequence may be arranged, and each of the n records r1, r2, . . . , rn may be trimmed to b bytes.” Para.0128 “Each 1D auto-encoder includes four encoding layers and four decoding layers, and the encoding layers may generate a series of m-dimension reduction vectors (E1, E2, . . . En∈Rm).” Each record is converted into a 1xm pixel sequence E1-n with pixel count m);
generating an image having a first dimension of N pixels, the image comprising the first one-dimensional pixel sequence in a first color channel in a first line of the image, the second one-dimensional pixel sequence in a second color channel in a distinct second line of the image (Lim: Fig. 4 para.0129 “The generated vectors E1, E2, . . . , and En may be spliced to form a two-dimensional (2D) image-like input E∈Rn×m of a 2D auto-encoder, and the 2D auto-encoder may perform another feature learning task.” The one-dimensional arrays of pixels are stacked into a 2D image E as in Fig. 4, with length m. para.0157 “In the first and second rows, pixel values are normalized, and when the value becomes closer to 1.0, the pixel is colored with a darker tone.” Each pixel has a corresponding color.).
Therefore it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Limb with Lim in order to incorporate processing the first set of bytes to obtain a first sequence of bytes from the incoming packet; processing the second set of bytes to obtain a second sequence of bytes from the outgoing packet; generating a first one-dimensional pixel sequence having a pixel count N and corresponding to the first sequence of bytes; generating a second one-dimensional pixel sequence having the pixel count N and corresponding to the second sequence of bytes; generating an image having a first dimension of N pixels, the image comprising the first one-dimensional pixel sequence in a first color channel in a first line of the image, the second one-dimensional pixel sequence in a second color channel in a distinct second line of the image.
One of ordinary skill in the art would have been motivated to combine because of the expected benefit of improving security in the network (Lim: para.0008).
However Limb-Lim does not explicitly disclose determining color channels for the first one-dimensional pixel sequence and the second one-dimensional pixel sequence based on incoming or outgoing packet status.
Nakamura discloses determining color channels for the first one-dimensional pixel sequence and the second one-dimensional pixel sequence based on incoming or outgoing packet status (Nakamura: para.0049 “FIG. 6 is a diagram for explaining image conversion for one packet in the embodiment of the present invention. In the present embodiment, five types of fields of a time at which a packet is received (timestamp), a transmission source port number (Src Port), a destination port number (Dst Port), a sequence number (Seq), and a window size (Win) are set as imaging targets, and are handled as a 5 × 1 array having values corresponding to these fields as elements.” Para.0052 “In this way, for example, when the same port number is continuously designated, the same color is obtained, and when the port number is periodically changed, a specific pattern is formed in the image, so that the feature after imaging becomes clear. In addition, the color arrangement of each pixel is a gray scale as described above, and in the present embodiment, when the value after normalization is 0, the color is black, and when the value is 1, the color is white. In this embodiment, an image is generated by generating a 100 × 5 array and then transposing the array. Thus, an image of 5 × 100 pixels is generated.” See Fig. 6-8. The array for each packet is generated to include the destination port as a particular color, i.e. color channel information is determined in order to establish a visible color change sequence based on changes in source and destination, i.e. incoming or outgoing status represented as ports.).
Therefore it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Limb-Lim with that of Nakamura in order to incorporate determining color channels for the first one-dimensional pixel sequence and the second one-dimensional pixel sequence based on incoming or outgoing packet status.
One of ordinary skill in the art would have been motivated to combine because of the expected benefit of identifying patterns in malicious attacks in the packets (Nakamura: para.0031, para.0052, para.0072).
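For illustration only, the image conversion quoted above from Nakamura para.0052 (normalizing each field between 0 and 1 and transposing the packets-by-fields array so that each field becomes one line of the image) can be sketched as follows. This is a minimal sketch and not Nakamura's implementation; the function name `normalize_and_transpose`, the min-max formula, and the handling of a constant field as all zeros are assumptions for the example.

```python
def normalize_and_transpose(rows):
    """Normalize each field (column) to the range 0..1 and return one list per field.

    rows holds one tuple of numeric field values per packet. The returned
    structure is the transposed array: fields x packets. Assumption for this
    sketch: a field whose values are all equal is mapped to 0.0.
    """
    image_lines = []
    for column in zip(*rows):
        low, high = min(column), max(column)
        span = high - low
        image_lines.append(
            [(value - low) / span if span else 0.0 for value in column]
        )
    return image_lines


# Hypothetical packets with two fields each: (timestamp, destination port).
packets = [(10.0, 443), (12.0, 443), (14.0, 80)]
lines = normalize_and_transpose(packets)
```

In this sketch, repeated port values produce repeated normalized values along a line, which is consistent with the quoted statement that the same port number yields the same color.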
Regarding Claim 11, Limb-Lim-Nakamura discloses claim 10 as set forth above.
Limb further discloses capturing, for each packet of the bi-directional packet flow, a set of auxiliary features comprising: an epoch time (Limb: col. 17 line 62-col. 18 line 9 “The method 400 may include, at action 408, capturing target payload data and target time data from a target flow of network packets between a target client application and a target server application. … the target time data may indicate time periods between arrivals of the network packets in the target flow. For example, the network analysis application 120 may capture, at action 408, target payload data (e.g., the payload data 114 n) and target time data (e.g., the time data 116 n) from a target flow of network packets (e.g., the flow of network packets 112 n) between a target client application (e.g., the client application 108 n) and a target server application (e.g., the server application 110 n).” timestamps of packets are epoch times, and are used to determine time data between packet arrivals.);
However Limb does not explicitly disclose capturing, for each packet of the bi-directional packet flow, a set of auxiliary features comprising: a source address; a destination address; a source port; a destination port; a protocol identifier; and a direction indicator.
Lim discloses capturing, for each packet of the bi-directional packet flow, a set of auxiliary features comprising: a source address; a destination address; a source port; a destination port; a protocol identifier; and a direction indicator (Lim: para.0061 “Each TCP segment is combined with IP and TCP headers to become a data frame called an IP packet. A sequence number of a TCP segment may be used to identify a packet order.” The header comprises information regarding the protocol. Para.0097 “In a TCP stream split operation 301, traffic flow may be split according to each TCP stream. In an exemplary embodiment, all packets having the same pair of source/destination IP addresses and port numbers (4-tuple) for each TCP stream may belong to the same connection.” Source and destination addresses and ports are obtained to differentiate packets in a connection. Para.0116 “Series of records received by the client terminal 201 and the server 203 may be aggregated separately. Meanwhile, the order of parsed records may be chronologically retained to merge the two independent record sequences into a single sorted sequence.” As client-received and server-received records are aggregated separately, there is some indication, based on destination, of the directionality of the packets.).
Therefore it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Limb with Lim in order to incorporate for each packet of the bi-directional packet flow, a set of auxiliary features comprising: a source address; a destination address; a source port; a destination port; a protocol identifier; and a direction indicator.
One of ordinary skill in the art would have been motivated to combine because of the expected benefit of improving security in the network (Lim: para.0008).
Regarding Claim 12, Limb-Lim-Nakamura discloses claim 11 as set forth above.
Limb further discloses wherein: the bi-directional packet flow is determined and maintained using the set of auxiliary features (Limb: col. 6 lines 50-55 Fig. 1 “More particularly, the network analysis application 120 may be configured to monitor flows of network packets 112 a-112 n between the clients 104 a-104 n and the servers 106 a-106 n in order to capture payload data 114 a-114 n and time data 116 a-116 n.” the auxiliary information is general packet header information and is used in the bidirectional flow, i.e. maintained between client and server as it is used to forward packets between devices.);
the set of auxiliary features is excluded from the first one-dimensional pixel sequence and the second one-dimensional pixel sequence (Limb: col. 17 line 62-col. 18 line 9 “The method 400 may include, at action 408, capturing target payload data and target time data from a target flow of network packets between a target client application and a target server application. … the target time data may indicate time periods between arrivals of the network packets in the target flow. For example, the network analysis application 120 may capture, at action 408, target payload data (e.g., the payload data 114 n) and target time data (e.g., the time data 116 n) from a target flow of network packets (e.g., the flow of network packets 112 n) between a target client application (e.g., the client application 108 n) and a target server application (e.g., the server application 110 n).” Col. 7 line 47-67 “The method 200 may include, at action 214, combining the normalized payload data with the normalized time data into a set of combined data points. In some embodiments, the combining of the normalized payload data with the normalized time data into the set of combined data points may include interleaving the normalized payload data and the normalized time data into an array of the set of combined data points.” timestamps of packets are epoch times, and are used to determine time data between packet arrivals. The time data, that is derived from the epoch time, is then used in combination with the payload data, i.e. none of the data from the headers, to generate a payload time diff array in step 214, which comprises multiple one dimensional packet sequences of p,t).
However Limb-Lim does not explicitly disclose the bi-directional packet flow is subject to an epoch-based ordering using the epoch time of each packet.
Nakamura discloses the bi-directional packet flow is subject to an epoch-based ordering using the epoch time of each packet (Nakamura: para.0052 “As a procedure of the image conversion, first, data of 100 packets is arranged in chronological order to generate a 100 × 5 array. The generated array of 100 × 5 is defined as integrated feature information. At this time, the value of each element of the array corresponding to the packet is normalized between 0 and 1 for each field (column). For example, in the case of a time stamp, the arrival time of the first arriving packet is set to 0, and the arrival time of the last arriving packet is set to 1.” The timestamp information, i.e. epoch time, is normalized and used to order the packets in chronological order in the image.).
Therefore it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Limb-Lim with that of Nakamura in order to incorporate the bi-directional packet flow is subject to an epoch-based ordering using the epoch time of each packet.
One of ordinary skill in the art would have been motivated to combine because of the expected benefit of identifying patterns in malicious attacks in the packets (Nakamura: para.0031, para.0052, para.0072).
Regarding Claim 13, Limb-Lim-Nakamura discloses claim 12 as set forth above.
Limb further discloses determining a first delta time comprises determining an epoch difference between the incoming packet and a first previous packet in the bi-directional packet flow under the epoch-based ordering (Limb: col. 9 line 27-40 “At action 206, the network analysis application 120 may convert the time periods between the arrivals of the network packets in the flow to positive Float64 time period values. This action may be represented by the code: timezero=np.abs(np.asarray(time_raw, dtype=‘float64’)), and may result in values as follows:” Time differences between packets are determined.); and
determining a second delta time comprises determining an epoch difference between the outgoing packet and a second previous packet in the bi-directional packet flow under the epoch-based ordering (Limb: col. 9 line 27-40 “At action 206, the network analysis application 120 may convert the time periods between the arrivals of the network packets in the flow to positive Float64 time period values. This action may be represented by the code: timezero=np.abs(np.asarray(time_raw, dtype=‘float64’)), and may result in values as follows:” Time differences between packets are determined. As seen in Fig. 1, these are bidirectional packets.).
Regarding Claim 14, Limb-Lim-Nakamura discloses claim 13 as set forth above.
Limb further discloses generating the first one-dimensional pixel sequence comprises including a first delta-time pixel generated by converting the first delta time to a channel value (Limb: Fig. 2 steps 208-212 col. 9 line 40-col. 10 line 28 “At action 208, the network analysis application 120 may apply a Log Base 2 transformation to each of the positive Float64 time period values to generate first normalized time period values. This action may be represented by the code: p.log 2(timezero, out=timezero), and may result in values as follows:… At action 212, the network analysis application 120 may pad each of the second normalized time period values to four digits, split each of the four digits into single-digit integers, and multiply each of the single-digit integers by 25.5. This action may be represented by the code: timediff=[round(25.5*int(x)) for n in timezero for x in str(int(n)).zfill(4)],” col. 17 lines 5-10 “Although the images 300 a-300 c are illustrated in FIGS. 3A-3C as lossless grayscale images, it is understood that the images 300 a-300 c may instead be color images” Each time difference value is converted into a value that is inserted into the matrix in steps 214-216, which is then converted into a pixel of an image. Therefore, this conversion converts the time into a channel value used to select the color for the pixel); and
generating the second one-dimensional pixel sequence comprises including a second delta-time pixel generated by converting the second delta time to a channel value (Limb: Fig. 2 steps 208-212 col. 9 line 40-col. 10 line 28 “At action 208, the network analysis application 120 may apply a Log Base 2 transformation to each of the positive Float64 time period values to generate first normalized time period values. This action may be represented by the code: p.log 2(timezero, out=timezero), and may result in values as follows:… At action 212, the network analysis application 120 may pad each of the second normalized time period values to four digits, split each of the four digits into single-digit integers, and multiply each of the single-digit integers by 25.5. This action may be represented by the code: timediff=[round(25.5*int(x)) for n in timezero for x in str(int(n)).zfill(4)],” col. 17 lines 5-10 “Although the images 300 a-300 c are illustrated in FIGS. 3A-3C as lossless grayscale images, it is understood that the images 300 a-300 c may instead be color images” Each time difference value is converted into a value that is inserted into the matrix in steps 214-216, which is then converted into a pixel of an image. Therefore, this conversion converts the time into a channel value used to select the color for the pixel).
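For illustration only, the digit-splitting step quoted above from Limb action 212 may be sketched as follows. This is a minimal sketch under stated assumptions: it reuses the list-comprehension expression quoted from Limb, but the function name digits_to_channel_values and the sample input are hypothetical, and the input is assumed to already hold non-negative "second normalized" time period values (the intermediate normalization at action 210 is not quoted in this action and is not reproduced here).

```python
def digits_to_channel_values(normalized_values):
    """Pad each value to four digits, split into digits, scale each by 25.5.

    Assumption: every input value is non-negative and below 10000, so that
    zero-padding yields exactly four decimal digits per value.
    """
    return [
        round(25.5 * int(digit))              # digits 0..9 map into 0..255
        for value in normalized_values
        for digit in str(int(value)).zfill(4)  # e.g. 12 -> "0012"
    ]


# Hypothetical normalized delta-time values for two packets.
sample = [2468, 12]
channel_values = digits_to_channel_values(sample)
```

Each input value yields four channel values in this sketch, so a flow of N delta times produces 4N values available for insertion into the image matrix.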
Regarding Claim 15, Limb-Lim-Nakamura discloses claim 13 as set forth above.
Limb further discloses wherein the second previous packet is the incoming packet (Limb: col. 9 line 27-40 “At action 206, the network analysis application 120 may convert the time periods between the arrivals of the network packets in the flow to positive Float64 time period values. This action may be represented by the code: timezero=np.abs(np.asarray(time_raw, dtype=‘float64’)), and may result in values as follows:” time differences between consecutive packets are determined, therefore the second packet may have its time compared to the incoming packet.).
Claim(s) 16, 20 is/are rejected under 35 U.S.C. 103 as being unpatentable over Limb (US 11,159,560 B1) in view of Lim et al. (hereinafter Lim, US 2022/0174083 A1) in view of Nakamura et al. (hereinafter Nakamura, WO 2019240054 A1) in view of El-Moussa et al. (hereinafter Moussa, US 2018/0115567 A1).
Regarding Claim 16, Limb-Lim-Nakamura discloses claim 11 as set forth above.
However Limb-Lim does not explicitly disclose wherein generating the image further comprises populating at least one pixel position of a third color channel of the image with auxiliary-information channel values derived from one or more of the auxiliary features.
Nakamura discloses wherein generating the image further comprises populating at least one pixel position of the image with auxiliary-information channel values derived from one or more of the auxiliary features (Nakamura: para.0049 “FIG. 6 is a diagram for explaining image conversion for one packet in the embodiment of the present invention. In the present embodiment, five types of fields of a time at which a packet is received (timestamp), a transmission source port number (Src Port), a destination port number (Dst Port), a sequence number (Seq), and a window size (Win) are set as imaging targets, and are handled as a 5 × 1 array having values corresponding to these fields as elements. These elements are selected as possible features for classifying the transmission source of the packet, and other elements may be used. A 5 × 1 array corresponding to one packet is used as feature information. Further, a field to be imaged is one pixel, and one packet is an image of 1 × 5 pixels as shown in FIG. 6.” Fig. 6 shows each of the auxiliary information represented as pixels).
Therefore it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Limb-Lim with that of Nakamura in order to incorporate wherein generating the image further comprises populating at least one pixel position of the image with auxiliary-information channel values derived from one or more of the auxiliary features.
One of ordinary skill in the art would have been motivated to combine because of the expected benefit of identifying patterns in malicious attacks in the packets (Nakamura: para.0031, para.0052, para.0072).
However Limb-Lim-Nakamura does not explicitly disclose wherein generating the image further comprises populating at least one pixel position of a third color channel of the image with auxiliary-information channel values derived from one or more of the auxiliary features.
Moussa discloses wherein generating the image further comprises populating at least one pixel position of a third color channel of the image with auxiliary- information channel values derived from one or more of features (Moussa: para.0120 “The profile image 1900 is a matrix such as a raster or bitmapped image in which each pixel is represented by a combination of red (R) 1902, green (G) 1904 and blue (B) 1908 components such that each pixel has a resulting red/green/blue (RGB) color, each color component having a range 0 to 255. The particular values of R, G and B correspond to different profile attributes for a byte in the connection setup portion to which a pixel corresponds.” Each pixel corresponds to a byte of data from the network traffic, and information regarding the packet information is in each color channel of the pixel).
Therefore it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Limb-Lim-Nakamura with Moussa in order to incorporate generating the image further comprises populating at least one pixel position of a third color channel of the image with auxiliary-information channel values derived from one or more of features, and apply this concept to the information that is represented as pixels in that of Nakamura.
One of ordinary skill in the art would have been motivated to combine because of the expected benefit of improved data representation for detection of malware in traffic (Moussa: para.0120).
Regarding Claim 20, Limb-Lim-Nakamura discloses claim 10 as set forth above.
However Limb-Lim does not explicitly disclose for a line corresponding to an incoming packet, the first color channel of the line is populated with the first one-dimensional pixel sequence and the second color channel of the line is set to zero; and for a line corresponding to an outgoing packet, the first color channel of the line is set to zero and the second color channel of the line is populated with the second one-dimensional pixel sequence.
Nakamura discloses a line corresponding to an incoming packet; and a line corresponding to an outgoing packet (Nakamura: para.0049 “FIG. 6 is a diagram for explaining image conversion for one packet in the embodiment of the present invention. In the present embodiment, five types of fields of a time at which a packet is received (timestamp), a transmission source port number (Src Port), a destination port number (Dst Port), a sequence number (Seq), and a window size (Win) are set as imaging targets, and are handled as a 5 × 1 array having values corresponding to these fields as elements.” Para.0052 “In this way, for example, when the same port number is continuously designated, the same color is obtained, and when the port number is periodically changed, a specific pattern is formed in the image, so that the feature after imaging becomes clear. In addition, the color arrangement of each pixel is a gray scale as described above, and in the present embodiment, when the value after normalization is 0, the color is black, and when the value is 1, the color is white. In this embodiment, an image is generated by generating a 100 × 5 array and then transposing the array. Thus, an image of 5 × 100 pixels is generated.” See Fig. 6-8. The array for each packet is generated to include destination port as a particular color, i.e. determine color channel information in order to establish a visible color change sequence based on changes in source and destination, i.e. incoming or outgoing status represented as ports.).
Therefore it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Limb-Lim with that of Nakamura in order to incorporate a line corresponding to an incoming packet; and a line corresponding to an outgoing packet.
One of ordinary skill in the art would have been motivated to combine because of the expected benefit of identifying patterns in malicious attacks in the packets (Nakamura: para.0031, para.0052, para.0072).
However Limb-Lim-Nakamura does not explicitly disclose for a line corresponding to an incoming packet, the first color channel of the line is populated with the first one-dimensional pixel sequence and the second color channel of the line is set to zero; and for a line corresponding to an outgoing packet, the first color channel of the line is set to zero and the second color channel of the line is populated with the second one-dimensional pixel sequence.
Moussa discloses for a line corresponding to a packet, the first color channel of the line is populated with the first one-dimensional pixel sequence and the second color channel of the line is set to zero (Moussa: Fig. 19 a-b para.0120 “The profile image 1900 is a matrix such as a raster or bitmapped image in which each pixel is represented by a combination of red (R) 1902, green (G) 1904 and blue (B) 1908 components such that each pixel has a resulting red/green/blue (RGB) color, each color component having a range 0 to 255. The particular values of R, G and B correspond to different profile attributes for a byte in the connection setup portion to which a pixel corresponds.” The RGB channel values are set to particular values related to the bytes of each packet, therefore the bytes of a flow may have no affinity towards a particular attribute and may be set to zero, whereas it may have affinity towards another attribute and comprise non zero values.); and
for a line corresponding to a packet, the first color channel of the line is set to zero and the second color channel of the line is populated with the second one-dimensional pixel sequence (Moussa: Fig. 19 a-b para.0120 “The profile image 1900 is a matrix such as a raster or bitmapped image in which each pixel is represented by a combination of red (R) 1902, green (G) 1904 and blue (B) 1908 components such that each pixel has a resulting red/green/blue (RGB) color, each color component having a range 0 to 255. The particular values of R, G and B correspond to different profile attributes for a byte in the connection setup portion to which a pixel corresponds.” Similarly, a second row of values may correspond to attributes the first row did not, but has no affinity towards attributes the first row has values towards, causing the opposite attributes to be set to zero.).
Therefore it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Limb-Lim-Nakamura with Moussa in order to incorporate for a line corresponding to a packet, the first color channel of the line is populated with the first one-dimensional pixel sequence and the second color channel of the line is set to zero; and for a line corresponding to a packet, the first color channel of the line is set to zero and the second color channel of the line is populated with the second one-dimensional pixel sequence, and apply this color channel encoding of attributes of a packet to that of Limb-Lim-Nakamura.
One of ordinary skill in the art would have been motivated to combine because of the expected benefit of improved data representation for detection of malware in traffic (Moussa: para.0120).
Claim(s) 17 is/are rejected under 35 U.S.C. 103 as being unpatentable over Limb (US 11,159,560 B1) in view of Lim et al. (hereinafter Lim, US 2022/0174083 A1) in view of Nakamura et al. (hereinafter Nakamura, WO 2019240054 A1) in view of Nagai et al. (hereinafter Nagai, US 2023/0412624 A1) in view of El-Moussa et al. (hereinafter Moussa, US 2018/0115567 A1).
Regarding Claim 17, Limb-Lim-Nakamura discloses claim 10 as set forth above.
Limb further discloses wherein generating the first one-dimensional pixel sequence and generating the second one-dimensional pixel sequence (Limb: Col. 7 line 47-67 “The method 200 may include, at action 214, combining the normalized payload data with the normalized time data into a set of combined data points. In some embodiments, the combining of the normalized payload data with the normalized time data into the set of combined data points may include interleaving the normalized payload data and the normalized time data into an array of the set of combined data points.” As seen in Fig. 2 step 214, the payload data and time data are combined. Each pair of payload and time data is a one-dimensional pixel sequence for a packet; a pair for one of the incoming data packets is a first pixel sequence.).
However Limb-Lim-Nakamura does not explicitly disclose wherein generating the first one-dimensional pixel sequence and generating the second one-dimensional pixel sequence each comprise mapping each hexadecimal byte value of a respective sequence of bytes to a corresponding intensity value for a respective color channel of the image.
Nagai discloses mapping each hexadecimal byte value of a respective sequence of bytes to a corresponding intensity value (Nagai: para.0022 “For example, the feature quantity generation unit 122 regards the payload of each packet as a hexadecimal number byte string and transforms each byte into a decimal number to generate the feature quantity.” The packet data is in hexadecimal format.).
Therefore it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Limb-Lim-Nakamura with Nagai in order to incorporate mapping each hexadecimal byte value of a respective sequence of bytes to a corresponding intensity value.
One of ordinary skill in the art would have been motivated to combine because of the expected benefit of known efficiency of data representation using hexadecimal notation (Nagai: para.0022).
However Limb-Lim-Nakamura-Nagai does not explicitly disclose wherein generating the first one-dimensional pixel sequence and generating the second one-dimensional pixel sequence each comprise mapping each hexadecimal byte value of a respective sequence of bytes to a corresponding intensity value for a respective color channel of the image.
Moussa discloses wherein generating the first one-dimensional pixel sequence and generating the second one-dimensional pixel sequence each comprise mapping each byte value of a respective sequence of bytes to a corresponding intensity value for a respective color channel of the image (Moussa: para.0119 “In one embodiment colors of pixels can be employed and/or pixel intensity attributes to represent profiles for bytes in connection setup portions such as particular values of coefficient/entropy, median values, deviation, averages, means, modes, ranges, minima, maxima and the like.” para.0120 “The profile image 1900 is a matrix such as a raster or bitmapped image in which each pixel is represented by a combination of red (R) 1902, green (G) 1904 and blue (B) 1908 components such that each pixel has a resulting red/green/blue (RGB) color, each color component having a range 0 to 255. The particular values of R, G and B correspond to different profile attributes for a byte in the connection setup portion to which a pixel corresponds…. For example, a first byte might typically exhibit normalized coefficient/entropy values about a mean value of 100 with a maximum deviation of 10. Such a byte can be profiled by simply identifying the mean and deviation values. A profile employing a mean and deviation value where a pixel R value corresponds to the mean and a pixel G value corresponds to the deviation can be identified as profile 1. The profile identifier can itself be encoded in the pixel B value so that, at runtime, the meaning of the R and G values can be determined. Other alternative profile types can therefore be employed within the same profile matrix (image), distinguished by the pixel B value itself.” Based on the values of each byte, a particular color and intensity for each RGB value can be set.).
Therefore it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Limb-Lim-Nakamura-Nagai with Moussa in order to incorporate generating the first one-dimensional pixel sequence and generating the second one-dimensional pixel sequence each comprise mapping each byte value of a respective sequence of bytes to a corresponding intensity value for a respective color channel of the image.
One of ordinary skill in the art would have been motivated to combine because of the expected benefit of improved data representation for detection of malware in traffic (Moussa: para.0120).
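For illustration only, the byte-to-intensity mapping addressed above (Nagai para.0022, each hexadecimal byte transformed into a decimal number; Moussa para.0120, color components having a range 0 to 255) may be sketched as follows. This is a minimal sketch and not code from any cited reference: the function names, the sample payload, and the choice to leave the other two color channels at zero are hypothetical.

```python
def hex_bytes_to_intensities(payload_hex):
    """Convert a hexadecimal byte string to decimal values in 0..255."""
    return list(bytes.fromhex(payload_hex))


def to_channel_row(intensities, channel):
    """Place each intensity in one RGB channel of a one-line pixel sequence.

    Assumption: channel is 0 (R), 1 (G), or 2 (B); the other two channels
    of every pixel are left at zero.
    """
    row = []
    for value in intensities:
        pixel = [0, 0, 0]
        pixel[channel] = value
        row.append(tuple(pixel))
    return row


# Hypothetical four-byte payload written as a hexadecimal string.
intensities = hex_bytes_to_intensities("00ff7f10")
red_row = to_channel_row(intensities, 0)
```

In this sketch a byte value is used directly as the intensity, since one byte and one 8-bit color component share the range 0 to 255.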
Claim(s) 18-19 is/are rejected under 35 U.S.C. 103 as being unpatentable over Limb (US 11,159,560 B1) in view of Lim et al. (hereinafter Lim, US 2022/0174083 A1) in view of Nakamura et al. (hereinafter Nakamura, WO 2019240054 A1) in view of Nagai et al. (hereinafter Nagai, US 2023/0412624 A1) in view of El-Moussa et al. (hereinafter Moussa, US 2018/0115567 A1) in view of Valisenko et al. (hereinafter Val, US 2023/0022279 A1).
Regarding Claim 18, Limb-Lim-Nakamura discloses claim 10 as set forth above.
However Limb-Lim does not explicitly disclose wherein generating the image comprises iteratively generating updated images as additional packets arrive until the image includes P lines corresponding to P packets of the bi-directional packet flow.
Nakamura discloses wherein generating the image comprises the image includes P lines corresponding to P packets of the bi-directional packet flow (Nakamura: para.0052 “As a procedure of the image conversion, first, data of 100 packets is arranged in chronological order to generate a 100 × 5 array.” Wherein p is 100 lines of 100 packets.).
Therefore it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Limb-Lim with that of Nakamura in order to incorporate generating the image comprises the image includes P lines corresponding to P packets of the bi-directional packet flow.
One of ordinary skill in the art would have been motivated to combine because of the expected benefit of identifying patterns in malicious attacks in the packets (Nakamura: para.0031, para.0052, para.0072).
However Limb-Lim-Nakamura does not explicitly disclose wherein generating the image comprises iteratively generating updated images as additional packets arrive until the image includes P lines corresponding to P packets of the bi-directional packet flow.
Val discloses iteratively analyzing a sliding window of packets corresponding to P packets of the packet flow (Val: para.0024 “The data object 102 is composed of a sequence of multiple data packets that are passing through the wire. In this specification, a data packet is any piece of data communicated over a network that partially or wholly represents a data object.” Para.0025 “The intrusion detection system can then iteratively analyze a sliding window of one or more of the stream objects of the data object 102 to determine whether the data object 102 is malicious.” A set of packets set to one or more packets, is analyzed in a sliding window, i.e. iteratively analyzing a number of packets at a time in sequence, to determine maliciousness of the set of packets.).
Therefore it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Limb-Lim-Nakamura with Val in order to incorporate iteratively analyzing a sliding window of packets corresponding to P packets of the packet flow, and apply this concept to the packet analysis of Limb-Lim-Nakamura that generates an image to analyze a packet flow, such that new images are generated in a sliding window format, with a new image generated for every P packets for analysis.
One of ordinary skill in the art would have been motivated to combine because of the expected benefit of effectively identifying signatures of malicious attacks (Val: para.0003, para.0030).
Regarding Claim 19, Limb-Lim-Nakamura-Val discloses claim 18 as set forth above.
Limb further discloses wherein processing the image using the trained convolutional-neural-network (CNN) classifier comprises, for each image, applying the image to the trained convolutional-neural-network (CNN) classifier to determine an updated likelihood of malicious activity (Limb: col. 18 lines 23-40 “The method 400 may include, at action 414, employing the trained convolutional neural network to determine an output including an extent to which the target image matches one of the training images in order to determine a likelihood that the target client application and/or the target server application matches one of the training client applications and/or one of the training server applications. … in order to determine a likelihood that the target client application (e.g., the client application 108 n) and/or the target server application (e.g., the server application 1110 n) matches one of the training client applications (e.g., the client applications 108 a-108 c) and/or one of the training server applications (e.g., the server applications 110 a-110 c). …For example, where at least one of the training client applications (e.g., client applications 108 a-108 c) and the training server applications (e.g., server application 110 a-110 c) is a known malicious application, the convolutional neural network 120 may have been trained to recognize the same or similar malicious application (e.g., a similar application may be slightly different, but a match above a threshold, such as 90%, may nevertheless identify the similar application as matching above a threshold, which may indicate that the malware is at least in the same malware family)” the image is processed to determine using a trained neural network, trained in step 406 Fig. 4, if the image matches a known malicious image).
However Limb-Lim-Nakamura does not explicitly disclose wherein processing the image using the trained convolutional-neural-network (CNN) classifier comprises, for each updated image, applying the updated image to the trained convolutional-neural-network (CNN) classifier to determine an updated likelihood of malicious activity.
Val discloses generating updated set of stream objects (Val: para.0024 “The data object 102 is composed of a sequence of multiple data packets that are passing through the wire. In this specification, a data packet is any piece of data communicated over a network that partially or wholly represents a data object.” Para.0025 “The intrusion detection system can then iteratively analyze a sliding window of one or more of the stream objects of the data object 102 to determine whether the data object 102 is malicious.” A set of packets set to one or more packets, is analyzed in a sliding window, i.e. iteratively analyzing a number of packets at a time in sequence, to determine maliciousness of the set of packets.).
Therefore it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Limb-Lim-Nakamura with Val in order to incorporate generating an updated set of stream objects, and apply this concept to the images generated in Limb-Lim-Nakamura such that images are generated in a sliding window fashion, and used in the CNN analysis of Limb.
One of ordinary skill in the art would have been motivated to combine because of the expected benefit of effectively identifying signatures of malicious attacks (Val: para.0003, para.0030).
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Chen et al., US 2019/0042745 A1, see Fig. 10, para.0032, para.0088 showing converting packets to pixels for malware detection, Fig. 8 neural network.
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to EUI H KIM whose telephone number is (571)272-8133. The examiner can normally be reached 7:30-5 M-R, M-F alternating.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kamal B Divecha, can be reached at (571)272-5863. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/EUI H KIM/ Examiner, Art Unit 2453
/KAMAL B DIVECHA/ Supervisory Patent Examiner, Art Unit 2453