Last updated: May 29, 2026

Application No. 18/310,812

NEW ENTITY DETECTION USING PROBABILISTIC DATA STRUCTURES

Non-Final OA §102 §103

Filed

May 02, 2023

Examiner

DOAN, HIEN VAN

Art Unit

2449

Tech Center

2400 — Computer Networks

Assignee

Microsoft Technology Licensing, LLC

OA Round

1 (Non-Final)

Interview Optional

An interview has already been conducted in this application's prosecution history. This examiner has a 51% grant rate with a +34.0% interview lift. Because an interview has already been tried, a written response with narrowed claims, based on precedent claim evolution patterns, is recommended.

Based on 178 resolved cases, 2023–2026

Examiner Intelligence

DOAN, HIEN VAN

Grants 51% of resolved cases

Career Allowance Rate

90 granted / 178 resolved

-7.4% vs TC avg

Strong +34% interview lift

+34.0% Interview Lift (resolved cases with interview versus without)

Typical timeline

4y 2m

Avg Prosecution

9 currently pending

Career history

196

Total Applications

across all art units

Statute-Specific Performance

§101

1.7%

-38.3% vs TC avg

§103

89.6%

+49.6% vs TC avg

§102

7.3%

-32.7% vs TC avg

§112

0.9%

-39.1% vs TC avg

Based on career data from 178 resolved cases

Office Action

§102 §103

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim status: claims 1-20 are pending in this Office Action.

DETAILED ACTION

Claim Objections
Claims 7 and 14 are objected to because of the following informalities:
Regarding claims 7 and 14: the phrase "the probabilistic data structure" should be -- the first probabilistic data structure --. Appropriate correction is required.

Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale or otherwise available to the public before the effective filing date of the claimed invention.

Claims 1-4, 6-11, 13-18, and 20 are rejected under 35 U.S.C. 102(b) as being anticipated by Natarajan (US 20150319182 A1) hereafter referred to as “Natarajan”.

Regarding claim 1:
Natarajan discloses A method of intrusion detection comprising: 
detecting a network event associated with a resource ([0047] identify the Uniform Resource Locator (URL) address of URL requests as the information key and hash the URL address; or may identify the file name and the file size of an executable file information key and hash the file name and file size of the executable file. [0048] queries where the answer to a request for information is usually positive. Such instances may include, for example, whether content at a given URL has been scanned for inappropriate (e.g., pornographic) content)
determining a first identifier associated with the detected network event ([0047] identify the Uniform Resource Locator (URL) address of URL requests as the information key and hash the URL address; or may identify the file name and the file size of an executable file information key and hash the file name and file size of the executable file. [0048] whether a given file has been virus scanned, whether content at a given URL has been scanned for inappropriate (e.g., pornographic) content); 
performing a lookup of an element representing the first identifier in a first probabilistic data structure (fig. 1[0047] the detection processing filter 112 may be used as the first stage of an information lookup procedure … identify the Uniform Resource Locator (URL) address of URL requests as the information key and hash the URL address … The information key is hashed to generate an index value (i.e., a bit position). A value of zero in a bit position in the guard table can indicate, for example, absence of information, while a one in that bit position can indicate presence of information … identify the Uniform Resource Locator (URL) address of URL requests as the information key and hash the URL address. [0048] the detection processing filter 112 may be a Bloom filter implemented by a single hash function. The Bloom filter may be sparse table, i.e., the tables include many zeros and few ones, and the hash function is chosen to minimize or eliminate false negatives which are, for example, instances where an information key is hashed to a bit position. Note: table with zeros and ones is probabilistic data structure; See spec [0019] PDSs represent lists of entities as probabilistic data structures. Adding new entities turns on (e.g., sets from “0” to “1”) different bits in the data structure)
determining the first identifier does not exist in the first probabilistic data structure based on at least the element returned by the lookup ([0047] an information lookup procedure … The information key is hashed to generate an index value (i.e., a bit position). A value of zero in a bit position in the guard table can indicate, for example, absence of information, while a one in that bit position can indicate presence of information. Alternatively, a one could be used to represent absence, and a zero to represent presence. Each content item may have an information key that is hashed. For example, the processing node manager 118 may identify the Uniform Resource Locator (URL) address of URL requests as the information key and hash the URL address … [0050] The processing node 110 may, for example, use the information in the local detection processing filter 112 to quickly determine the presence and/or absence of information, e.g., whether a particular URL has been checked for malware; whether a particular executable has been virus scanned, etc …  the master threat data 124 may be distributed to the processing nodes 110, which then store a local copy of the threat data 114. Note: generate an index value (zero) is returned by the lookup; absence of information (URL address) is the first identifier does not exist); and 
performing a first action in response to the determination that the first identifier does not exist in the first probabilistic data structure ([0048] Thus, if the detection processing filter 112 indicates that the content item has not been processed, then a worst case null lookup operation into the threat data 114 is avoided, and a threat detection can be implemented immediately.  [0050] The processing node 110 may, for example, use the information in the local detection processing filter 112 to quickly determine the presence and/or absence of information, e.g., whether a particular URL has been checked for malware; whether a particular executable has been virus scanned, etc …  the master threat data 124 may be distributed to the processing nodes 110, which then store a local copy of the threat data 114 [0051] If, however, the content item is determined to not be classified by the threat data 114, then the processing node manager 118 may cause one or more of the data inspection engines 117 to perform the threat detection processes to classify the content item according to a threat classification)
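
The lookup-then-act flow that the examiner maps onto claim 1 can be sketched as follows. This is a minimal illustration, assuming a guard table indexed by a single hash as described in the cited passages ([0047], [0048]). The names `GuardTable`, `might_contain`, and `handle_event`, the use of SHA-256, and the action labels are hypothetical and come from neither the application nor Natarajan.

```python
import hashlib


class GuardTable:
    """Bit table indexed by a single hash (hypothetical illustration)."""

    def __init__(self, num_bits: int = 1024) -> None:
        self.num_bits = num_bits
        self.bits = bytearray(num_bits)  # one byte per bit, for readability

    def _index(self, key: str) -> int:
        # Hash the information key to a bit position.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key: str) -> None:
        self.bits[self._index(key)] = 1

    def might_contain(self, key: str) -> bool:
        # 0 means "definitely absent"; 1 means "possibly present".
        return self.bits[self._index(key)] == 1


def handle_event(table: GuardTable, identifier: str) -> str:
    """Return an action label for a network event's identifier."""
    if not table.might_contain(identifier):
        # Assumption: record the identifier so later events are not
        # flagged again (insertion itself is the subject of claim 5).
        table.add(identifier)
        return "first_action"
    return "no_action"


if __name__ == "__main__":
    seen = GuardTable()
    print(handle_event(seen, "http://example.com/a"))
    print(handle_event(seen, "http://example.com/a"))
```

A zero bit is a definite "not seen" answer, while a one bit only means "possibly seen", because two different keys can hash to the same position. That asymmetry is what makes the table a probabilistic data structure in the sense the examiner's note relies on.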

Regarding claim 2:
Natarajan discloses The method of claim 1, wherein said performing a lookup comprises: 
generating a hash of the first identifier ([0047] the detection processing filter 112 may be used as the first stage of an information lookup procedure … The information key is hashed to generate an index value … identify the Uniform Resource Locator (URL) address of URL requests as the information key and hash the URL address); and 
performing the lookup on the first probabilistic data structure using the generated hash as an index key ([0047] the detection processing filter 112 may be used as the first stage of an information lookup procedure … The information key is hashed to generate an index value (i.e., a bit position). A value of zero in a bit position in the guard table can indicate, for example, absence of information.  Note: table with bits of zeros and ones is probabilistic data structure)

Regarding claim 3:
Natarajan discloses The method of claim 1, further comprising: 
determining whether a second identifier exists in a second probabilistic data structure (Fig. 2 [0084] the IPs and URLs. [0100] packets having IP address, DNS name or URL matching the search string. [0047] The information key is hashed to generate an index value (i.e., a bit position). A value of zero in a bit position in the guard table can indicate, for example, absence of information, while a one in that bit position can indicate presence of information … identify the Uniform Resource Locator (URL) address of URL requests as the information key and hash the URL address; or may identify the file name and the file size of an executable file information key and hash the file name and file size of the executable file. [0050] the detection processing filter 122 may be a guard table … the master threat data 124 may classify content items by threat classifications, e.g., a list of known viruses, a list of known malware sites, spam email domains, list of known or detected phishing sites, etc … Other remedial actions and processes may also be facilitated by the authority node 120. Note: table 122 (comprising zeros and ones) is a second probabilistic data structure); and
performing a second action in response to determining that the first identifier does not exist in the first probabilistic data structure ([0048] Thus, if the detection processing filter 112 indicates that the content item has not been processed (not checked for malware, virus scan), then a worst case null lookup operation into the threat data 114 is avoided, and a threat detection can be implemented immediately. [0047] an information lookup procedure … The information key is hashed to generate an index value (i.e., a bit position). A value of zero in a bit position in the guard table can indicate, for example, absence of information … may identify the Uniform Resource Locator (URL) address of URL requests as the information key and hash the URL address. [0050] determine the presence and/or absence of information, e.g., whether a particular URL has been checked for malware; whether a particular executable has been virus scanned, etc. [0032] cloud-based security system to sandbox unknown content (which can also be referred to as BA content) in the cloud, to install the unknown content for observation and analysis, and to leverage the results in the cloud for near immediate protection from newly detected malware. Note: install the unknown content for observation and analysis, and to leverage the results in the cloud for near immediate protection from newly detected malware is performing a second action) and that the second identifier exists in the second probabilistic data structure
([0084] the IPs and URLs. [0047] A value of zero in a bit position in the guard table can indicate, for example, absence of information, while a one in that bit position can indicate presence of information … identify the file name and the file size of an executable file information key and hash the file name and file size of the executable file. [0049] The detection processing filter 122 may include data indicating whether content items have been processed (checked for malware, virus scan). [0050] the detection processing filter 122 may be a guard table … the master threat data 124 may classify content items by threat classifications, e.g., a list of known viruses, a list of known malware sites, spam email domains, list of known or detected phishing sites, etc … Other remedial actions and processes may also be facilitated by the authority node 120. Note: table 122 (see Fig. 2) is the second probabilistic data structure; file names or URLs having been processed for malware/virus and threat classification is the second identifier existing in the second probabilistic data structure).
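
The combined condition recited in claim 3 (first identifier absent from the first structure and second identifier present in the second) can be sketched as below. This is only an illustration under stated assumptions: both structures are modeled as single-hash bit tables, and the helper names (`make_table`, `add`, `present`, `choose_action`), the sample identifiers, and the action labels are hypothetical.

```python
import hashlib


def bit_index(key: str, num_bits: int) -> int:
    """Map a key to one bit position with a single hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_bits


def make_table(num_bits: int = 1024) -> bytearray:
    return bytearray(num_bits)  # all zeros: nothing recorded yet


def add(table: bytearray, key: str) -> None:
    table[bit_index(key, len(table))] = 1


def present(table: bytearray, key: str) -> bool:
    # True means "possibly present"; False means "definitely absent".
    return table[bit_index(key, len(table))] == 1


def choose_action(first_table: bytearray, second_table: bytearray,
                  first_id: str, second_id: str) -> str:
    first_is_new = not present(first_table, first_id)
    second_is_known = present(second_table, second_id)
    if first_is_new and second_is_known:
        return "second_action"  # the combined condition of claim 3
    if first_is_new:
        return "first_action"   # the condition of claim 1 alone
    return "no_action"


if __name__ == "__main__":
    urls, files = make_table(), make_table()
    add(files, "setup.exe:4096")
    print(choose_action(urls, files, "http://example.com/x", "setup.exe:4096"))
```

Keeping one table per identifier type, as here, mirrors the way claim 4 associates each structure with its own identifier type. Whether Natarajan's filters 112 and 122 actually play these two roles is the examiner's position, not something the sketch establishes.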

Regarding claim 4:
 	Natarajan discloses The method of claim 2, wherein the first probabilistic data structure is associated with a first identifier type ([0048] The detection processing filter 112 thus improves performance of queries where the answer to a request … whether content at a given URL has been scanned … the detection processing filter 112 may be a Bloom filter implemented by a single hash function) and the second probabilistic data structure is associated with a second identifier type ([0047] identify the Uniform Resource Locator (URL) address of URL requests as the information key and hash the URL address; or may identify the file name and the file size of an executable file information key. [0050] the detection processing filter 122 may be a guard table … the master threat data 124 may classify content items by threat classifications, e.g., a list of known viruses, a list of known malware sites, spam email domains, list of known or detected phishing sites, etc. [0100] packets having IP address, DNS name or URL matching the search string), and wherein the first identifier type or the second identifier type comprise: 
a username ([0043] user login credentials for registered users of the enterprise 200 system. Such credentials may include a user identifiers, login passwords, and a login history associated with each user identifier); 
an access token; 
a device identifier; 
an application identifier; 
a query identifier ([0047] identify the Uniform Resource Locator (URL) address of URL requests. [0048] queries where the answer to a request for information is usually positive. Such instances may include, for example, whether content at a given URL has been scanned); or 
a process identifier ([0047] identify the Uniform Resource Locator (URL) address of URL requests).

Regarding claim 6:
Natarajan discloses The method of claim 1, wherein the first action comprises at least one of ([0048] Thus, if the detection processing filter 112 indicates that the content item has not been processed, then a worst case null lookup operation into the threat data 114 is avoided, and a threat detection can be implemented immediately.): 
logging the network event ([0039] receives user requests to external servers. [0043] user login credentials for registered users of the enterprise 200 system. Such credentials may include a user identifiers, login passwords, and a login history associated with each user identifier. [0047] identify the Uniform Resource Locator (URL) address of URL requests); 
providing an alert or a notification to at least one of an owner or administrator of the resource; 
denying access to the resource; 
dropping traffic associated with the network event; 
rerouting traffic associated with the network event; 
isolating traffic associated with the network event; or 
terminating a connection associated with the network event.  

Regarding claim 7:
Natarajan discloses The method of claim 1, wherein the probabilistic data structure comprises at least one of: 
a Bloom filter ([0048] the detection processing filter 112 may be a Bloom filter implemented by a single hash function. The Bloom filter may be sparse table, i.e., the tables include many zeros and few ones); 
a counting Bloom filter; 
a Ribbon filter; 
an XOR filter; or 
a cuckoo filter.  

Regarding claim 8:
Natarajan discloses A system for intrusion detection comprising: 
a processor ([0005] a processor); and 
a memory device that stores program code structured to cause the processor to ([0005] a processor communicatively coupled to the network interface and the data store; memory storing instructions that, when executed, cause the processor to): 
[Rejection rationale for claim 1 is applicable].

Regarding claim 9:
[Rejection rationale for claim 2 is applicable].

Regarding claim 10:
[Rejection rationale for claim 3 is applicable].

Regarding claim 11:
[Rejection rationale for claim 4 is applicable].

Regarding claim 13:
[Rejection rationale for claim 6 is applicable].

Regarding claim 14:
[Rejection rationale for claim 7 is applicable].

Regarding claim 15:
Natarajan discloses A computer-readable storage medium comprising computer-executable instructions, that when executed by a processor, cause the processor to ([0005] a processor communicatively coupled to the network interface and the data store; memory storing instructions that, when executed, cause the processor to): 
[Rejection rationale for claim 1 is applicable].

Regarding claim 16:
[Rejection rationale for claim 2 is applicable].

Regarding claim 17:
[Rejection rationale for claim 3 is applicable].

Regarding claim 18:
[Rejection rationale for claim 4 is applicable].

Regarding claim 20:
[Rejection rationale for claim 6 is applicable].

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
	
The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under pre-AIA 35 U.S.C. 103(a) are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 5, 12, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Natarajan (US 20150319182 A1) in view of Martin (US 20180004942 A1).

Regarding claim 5:
Natarajan teaches The method of claim 1, 
Natarajan does not explicitly disclose inserting the first identifier into the first probabilistic data structure by: hashing the first identifier with a plurality of hash functions to determine a plurality of mapped elements of the first probabilistic data structure, wherein the plurality of hash functions comprise uniform and independent hash functions that map to different elements of the first probabilistic data structure, updating values of the determined plurality of mapped elements of the first probabilistic data structure based on said hashing the first identifier with the plurality of hash functions.
Martin teaches further comprising: inserting the first identifier into the first probabilistic data structure by: 
hashing the first identifier with a plurality of hash functions to determine a plurality of mapped elements of the first probabilistic data structure, wherein the plurality of hash functions comprise uniform and independent hash functions that map to different elements of the first probabilistic data structure (Fig. 1 [0033] The system can also define a k-number of different hash functions, based on a target false positive rate (e.g., <1%), to map elements from the network accounting log to the bit array. [0034] the system can feed event metadata (e.g., an external IP address, a hostname, or a URL) of this event from the network accounting log to each of the k hash functions to calculate k array positions for this element and then set bits at each of these positions to 1 in order … by feeding the threat element to each of the k hash functions to get k array positions. [0032] in Block S120, the system can write metadata (IP address, URL or domain name, and hostname) of network events represented in the network accounting log to a probabilistic data structure); and 
updating values of the determined plurality of mapped elements of the first probabilistic data structure based on said hashing the first identifier with the plurality of hash functions ([0033] the system can generate an empty Bloom filter containing a bit array of m bits, all set to null, proportional to: a time window represented by the network accounting log (e.g., one year). [0034] the system can feed event metadata (e.g., an external IP address, a hostname, or a URL) of this event from the network accounting log to each of the k hash functions to calculate k array positions for this element and then set bits at each of these positions to 1 in order. Note: see Fig. 1; the entries of ones are updated at S120).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to take the teachings of Martin and apply them to the teachings of Natarajan to further implement inserting the first identifier into the first probabilistic data structure by hashing the first identifier with a plurality of hash functions to determine a plurality of mapped elements of the first probabilistic data structure, wherein the plurality of hash functions comprise uniform and independent hash functions that map to different elements of the first probabilistic data structure, and updating values of the determined plurality of mapped elements of the first probabilistic data structure based on said hashing the first identifier with the plurality of hash functions. One would have been motivated to do so in order to improve the system and method by providing a k-number of different hash functions to map elements from the network accounting log to the bit array (Martin, [0033]).
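
The k-hash insertion cited from Martin ([0033], [0034]) can be illustrated with a short sketch. It assumes a bit array of m bits and k salted SHA-256 hashes standing in for "uniform and independent" hash functions. The sizing helper uses the standard Bloom filter formulas m = -n ln p / (ln 2)^2 and k = (m / n) ln 2, which are textbook results rather than anything quoted from Martin. All names (`bloom_parameters`, `BloomFilter`) and the sample IP address are hypothetical.

```python
import hashlib
import math


def bloom_parameters(expected_items: int, target_fp_rate: float) -> tuple[int, int]:
    """Pick bit-array size m and hash count k for a target false positive rate."""
    m = math.ceil(-expected_items * math.log(target_fp_rate) / (math.log(2) ** 2))
    k = max(1, round(m / expected_items * math.log(2)))
    return m, k


class BloomFilter:
    def __init__(self, num_bits: int, num_hashes: int) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # all positions start at 0

    def _positions(self, key: str) -> list[int]:
        # Assumption: salting one hash with a seed approximates k
        # independent hash functions; two seeds may still collide.
        positions = []
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode("utf-8")).digest()
            positions.append(int.from_bytes(digest[:8], "big") % self.num_bits)
        return positions

    def add(self, key: str) -> None:
        # Insertion: set the bit at each of the k mapped positions to 1.
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key: str) -> bool:
        # Present only if every mapped position is already 1.
        return all(self.bits[pos] for pos in self._positions(key))


if __name__ == "__main__":
    m, k = bloom_parameters(expected_items=1000, target_fp_rate=0.01)
    bf = BloomFilter(m, k)
    bf.add("203.0.113.7")
    print(m, k, bf.might_contain("203.0.113.7"))
```

Because the salted hashes can occasionally land on the same bit, this sketch does not guarantee that the k functions "map to different elements" as the claim language recites. An implementation that needed that property would have to partition the bit array or re-hash on collision.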

Regarding claim 12:
[Rejection rationale for claim 5 is applicable].

Regarding claim 19:
[Rejection rationale for claim 5 is applicable].

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to HIEN DOAN, whose telephone number is (571) 272-4317. The examiner can normally be reached Monday-Thursday and biweekly Friday, 9am-6pm.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, VIVEK SRIVASTAVA, can be reached at (571) 272-7304. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/HIEN V DOAN/Examiner, Art Unit 2449     

/VIVEK SRIVASTAVA/Supervisory Patent Examiner, Art Unit 2449

Prosecution Timeline

May 02, 2023

Application Filed

Feb 05, 2026

Non-Final Rejection mailed — §102, §103

Apr 01, 2026

Interview Requested

Apr 09, 2026

Examiner Interview Summary

Apr 09, 2026

Applicant Interview (Telephonic)

Apr 16, 2026

Response Filed

Precedent Cases

Applications granted by this same examiner with similar technology

15/925,258

Patent 12641141

IDENTIFYING AN HTTP RESOURCE USING MULTI-VARIANT HTTP REQUESTS

8y 2m to grant Granted May 26, 2026

16/264,292

Patent 12615317

INTERACTIVE CUSTOMIZED PUSH NOTIFICATIONS WITH CUSTOMIZED ACTIONS

7y 2m to grant Granted Apr 28, 2026

17/643,878

Patent 12542722

AUTOMATED INITIATION OF HELP SESSION IN A VIDEO STREAMING SYSTEM

4y 1m to grant Granted Feb 03, 2026

17/456,520

Patent 12470569

ANOMALY DETECTION RELATING TO COMMUNICATIONS USING INFORMATION EMBEDDING

3y 11m to grant Granted Nov 11, 2025

17/756,835

Patent 12443717

METHODS & PROCESSES TO SECURELY UPDATE SECURE ELEMENTS

3y 4m to grant Granted Oct 14, 2025

Study what changed to get past this examiner. Based on 5 most recent grants.

Prosecution Projections

1-2

Expected OA Rounds

51%

Grant Probability

85%

With Interview (+34.0%)

4y 2m (~1y 1m remaining)

Median Time to Grant

Low

PTA Risk

Based on 178 resolved cases by this examiner. Grant probability derived from career allowance rate.