Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 21-40 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. Representative claim 21 recites creating a data set of entity records, applying a sequence of data‑security techniques to identify query terms and matching subsets, applying noise/threshold tests, and using the resulting set of query terms to perform digital content distribution.
The limitations of creating a data set of entity records and determining a first query term and a matching subset by applying a first data security technique, as drafted, are processes that, under their broadest reasonable interpretation, cover performance of the limitations as mental processes or mathematical concepts, apart from the recitation of generic computer components. For example, “creating a data set” and “determining a first query term” may be performed by a person assembling records and selecting candidate query terms based on observed attributes. Likewise, applying a data security technique such as k‑anonymity or differential privacy and checking whether an output satisfies a threshold are mathematical operations (manipulation of data, application of statistical transformations, and comparison against numeric thresholds) that can be expressed as abstract mathematical concepts or mental steps. Under the broadest reasonable interpretation, these steps could be performed on paper or mentally (i.e., by calculating anonymization groupings, adding noise, and evaluating whether noisy outputs meet a threshold), absent any concrete implementation details beyond generic computer use. Where a claim limitation, read broadly, covers performance in the human mind but for the recitation of generic computer hardware, it falls within the “Mathematical Concepts” and/or “Mental Processes” groupings of abstract ideas. Accordingly, the claim recites an abstract idea.
This judicial exception is not integrated into a practical application. In particular, the claim recites only routine data processing steps performed in a generic computer environment: creating a data set, applying well‑known privacy techniques, adding noise, comparing against thresholds, and using selected query terms for content distribution. The claim does not recite any specific, non‑conventional way of implementing the privacy techniques, any improvement in the functioning of the computer itself, any specialized hardware, or any particular data structure or protocol that meaningfully limits the claimed steps. The recited elements describe the application of mathematical and statistical techniques to information and the use of the results for content selection, all implemented by conventional computing components. The mere recitation of applying k‑anonymity, differential privacy, adding noise, and performing comparisons, without describing a particular unconventional manner of performing those operations or tying them to a specific technical improvement in computer functionality, amounts to no more than instructions to apply the abstract idea using generic computer components. Thus, the additional elements do not integrate the abstract idea into a practical application and do not impose meaningful limitations on the abstract idea.
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above, the steps of applying known privacy techniques and threshold tests, and then using selected terms for digital content distribution, are conventional data‑processing and information‑selection operations. The claim does not recite an inventive concept such as a novel algorithmic technique beyond routine statistical methods, nor does it recite specialized hardware or a technical improvement in computer performance or security that would transform the abstract idea into patent‑eligible subject matter. Merely performing the abstract idea on a generic computer or using standard privacy techniques and noise addition cannot supply the necessary “significantly more.” Accordingly, claim 21 is not patent eligible under 35 U.S.C. 101.
Claims 22-28, 29-36, and 37-40 recite substantially similar concepts in system and computer‑readable medium forms and are rejected for the same reasons as claim 21 under 35 U.S.C. 101.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Daub et al. (US 20210004864 A1):
[0030] FIG. 1 depicts a block diagram of an example implementation of a networked computer system 100. The system 100 includes a first data computing system 102, a second data computing system 104, and a third data computing system 106. The system 100 can also include a plurality of user devices 108a-108e (collectively referred to as user devices 108). The first, second, and third data computing systems 102, 104, and 106, and the user devices 108 can communicate over a network 110, which can include one or more of a local area networks, a wide area network, private networks, public networks, and the Internet. In some examples, the first data computing system 102 can be a content item (e.g., ads) provider that can provide content items for distribution and rendering on the user devices 108. The second data computing system 104 and the third data computing system can be content item distribution systems that distribute the content items to the user devices based on, for example, the content provided to the user devices. As an example, users on the user devices 108 can be provided with content such as, for example, web pages or audio-visual content. The content can include content item slots (e.g., positional or temporal) for displaying content items along with the content. The requests for displaying content items in the content item slots can be received by the content item distribution systems. The requests can include a user device identifier identifying the user device 108 and additional information related to the user device, the content provided to the user device 108, etc. The content item distribution system can utilize the information included in the content item request to select a content item, and provide the content item to the user device 108 to be rendered along with the provided content. The content item provided to the user device 108 can be part of a content item campaign run by, for example, the first data computing system 102. 
[0083] The addition of noise to the vectors, whether generated by the binomial vector or the vector of counts approach, can improve differential privacy of the user identifiers. The differential privacy of the binomial vectors approach and the vectors of counts approach discussed above can be achieved while sacrificing less accuracy than previously existing differentially private cardinality estimators. [0119] In another non-limiting exemplary embodiment for implementing and testing various architectures, which does not limit the scope of the invention, the accuracy of the implementation is tested while varying the scale of the Laplacian noise (b=1/ε). In the example embodiment described herein, both user identifier sets N.sub.1 (302) and N.sub.2 (304) have the same cardinality (N.sub.1=N.sub.2). The intersection of both sets (306) is fixed at one tenth of the size of N.sub.1. FIG. 12 shows a data plot obtained from the experiment implemented using this example embodiment. . . . The contours 1202, 1204, 1206, 1208, and 1210 shown in the plot in FIG. 12 each show a constant standard error of 2% for different values of ε. The contour 1202 shows the threshold of 2% constant error when ε=2ln(3). The contour 1204 shows the threshold of 2% constant error when ε=sqrt(2)ln(3). The contour 1206 shows the threshold of 2% constant error when ε=ln(3). The contour 1208 shows the threshold of 2% constant error when ε=(1/sqrt(2))ln(3). The contour 1210 shows the threshold of 2% constant error when ε=(1/2)ln(3).
Wu et al. (US 12223078 B2):
(13) The concept of the invention: PQL is the abbreviation of Protection Query Language. The parsing module encapsulates a PQL statement written with predetermined semantic and syntactic rules into a corresponding SQL statement; after the database is queried, a differential privacy algorithm injects noise into the query results, which effectively mitigates the privacy leakage of shared data. (14) In a specific embodiment, the invention provides a privacy protection query language PQL and a system thereof, comprising a PQL statement and a system; the PQL statement comprises: a PROTECT clause, a PICK clause, a WITH clause, a WITHRANGE clause, a GLOBAL clause, and a WHERE clause; the system comprises a parsing module, a query module, and a noise-injection module; the parsing module comprises a lexical analyzer and a syntactic analyzer. The user inputs the PQL statement according to the predetermined semantic and syntactic rules and sends it to the parsing module. After the parsing module receives the PQL statement, the statement is checked for errors by the lexical analyzer; correct results are sent to the syntactic analyzer, otherwise the incorrect contents are pointed out. The syntactic analyzer performs grammatical and semantic checks on the PQL statement, generates a mapping table and a parameter table from the correct results, and sends them to the query module and the noise-injection module respectively, otherwise the incorrect contents are pointed out. After receiving the mapping table, the query module encapsulates it into an SQL statement and verifies the encapsulated SQL statement; the SQL statement is submitted to the database for query, and the final query results are sent to the noise-injection module. The noise-injection module obtains the final query results from the query module, calculates the noise-injection sensitivity according to the parameter table, and substitutes the real query results, sensitivity, and privacy budget into
the underlying differential privacy algorithm function for noise injection; the results after noise injection are then returned. The PROTECT clause and the PICK clause are first required clauses; the WITH clause and the WITHRANGE clause are second required clauses; the GLOBAL clause and the WHERE clause are optional clauses. The PQL statement also comprises aggregate functions comprising Avg[ ], Total[ ], Highest[ ], Lowest[ ], and Compute[ ], wherein: the Avg[ ] is the sequence average value function; the Total[ ] is the sequence sum total function; the Highest[ ] is the sequence maximum value function; the Lowest[ ] is the sequence minimum value function; the Compute[ ] is the sequence line number function.
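The noise-injection step Wu describes, in which the real query result, a computed sensitivity, and a privacy budget are fed to an underlying differential privacy algorithm, can be sketched as a standard Laplace mechanism. This is an illustrative reconstruction, not the reference's code; deriving the bound for Avg[ ] from WITHRANGE-style value limits is an assumption.

```python
import random

def avg_sensitivity(lo: float, hi: float, n: int) -> float:
    # For an Avg[] over n rows whose values are bounded to [lo, hi]
    # (e.g., declared via a WITHRANGE-style clause), changing one row
    # moves the average by at most (hi - lo) / n.
    return (hi - lo) / n

def inject_noise(true_result: float, sensitivity: float,
                 epsilon: float, rng: random.Random) -> float:
    # Laplace mechanism: noise scale = sensitivity / privacy budget.
    b = sensitivity / epsilon
    return true_result + rng.expovariate(1.0 / b) - rng.expovariate(1.0 / b)
```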
Antonatos et al. (US 20190087604 A1):
[0015] Thus, as described herein, the present disclosure provides for a system to implement a differential privacy on clustered data that allows for the flexible preservation of information while maintaining rigorous mathematical guarantees of privacy. In one aspect, data anonymity is achieved on clustered data (e.g., k-anonymity), while providing information-loss measurements. In one aspect, a differential privacy may be applied on individual data points or datasets as distinct from statistics on the data. The differential privacy may be applied to any data source with both categorical and numerical attributes such as, for example, healthcare data, which includes categorical attributes (e.g., place of birth, race, gender) and numerical attributes (e.g., measurements such as height or weight). Thus, the application of differential privacy on clustered data ensures privacy by rendering different values statistically indistinguishable.
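The k-anonymity suppression Antonatos invokes can be sketched minimally as follows. The field names are hypothetical, and generalization of numerical attributes (e.g., binning heights or weights) is omitted.

```python
from collections import Counter

def enforce_k_anonymity(records: list, quasi_ids: tuple, k: int) -> list:
    # Keep only records whose quasi-identifier combination (e.g., place
    # of birth, race, gender) occurs in at least k records, so every
    # released row is indistinguishable among at least k individuals.
    key = lambda r: tuple(r[q] for q in quasi_ids)
    counts = Counter(key(r) for r in records)
    return [r for r in records if counts[key(r)] >= k]
```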
Peng et al. (US 20230144763 A1):
1. A method for generating a data structure for deduplicating data sets across a plurality of providers, the method comprising: maintaining, by a data processing system comprising one or more processors and a memory, in a database, a data set of records each identifying interactions between a plurality of users and a provider of the plurality of providers; initializing, by the data processing system, a plurality of vector data structures, wherein each vector data structure of the plurality of vector data structures corresponds to a respective frequency of a plurality of frequencies; determining, by the data processing system, for each user of the plurality of users, frequency data of the interactions between the user and the provider based on the data set of records, the frequency data of the user representing a number of the interactions between the user and the provider that have a target interaction type; updating, by the data processing system, the plurality of vector data structures based on the frequency data of each user of the plurality of users, wherein a first vector data structure of the plurality of vector data structures corresponds to a first frequency value of the plurality of frequencies and is updated to encode an identifier of a user of the plurality of users having frequency data indicating a number of interactions that matches the first frequency value, such that the plurality of vector data structures are differentially private; and sending, by the data processing system, the plurality of vector data structures to an analysis server for deduplication of the data set of records across the plurality of providers.
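The frequency-bucketing step of the quoted claim can be sketched as follows. This is illustrative only; the record field names are assumptions, and the encoding that makes the resulting vectors differentially private (e.g., noisy sketch insertion) is not reproduced here.

```python
from collections import Counter

def build_frequency_vectors(records: list, target_type: str,
                            frequencies: list) -> dict:
    # Count target-type interactions per user, then place each user
    # identifier into the vector whose frequency value matches that count.
    per_user = Counter(r["user"] for r in records if r["type"] == target_type)
    vectors = {f: [] for f in frequencies}
    for user, count in per_user.items():
        if count in vectors:
            vectors[count].append(user)
    return vectors
```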
Wang et al. (US 20220376900 A1):
[0106] The aggregation server 180-B can perform the same computations and filtering as the aggregation server 180-A using the same data, to reach the same results. That is, the aggregation server 180-B can generate a table that matches Table 2 generated by the aggregation server 180-A. [0107] In some implementations, both aggregation servers 180-A and 180-B can perform some optional operations to ensure differential privacy, such as using subsampling. Differential privacy is a technique for sharing data about a dataset by describing the patterns of groups within the dataset without providing individual data in the dataset. To do this, each aggregation server can first sample the data (e.g., rows in Table 1) with some probability beta. Then, the aggregation server applies the k-anonymity generalization described above to the sampled data only. That is, the aggregation server can determine the number of unique encrypted blindly signed keys (BlindlySignedKey) for each type of occurrence and filter from the sampled data the ones that do not meet the k-anonymity thresholds. [0108] For subsampling, to ensure that both aggregation servers replicate exactly the same sampling and perform the same differential privacy technique, the aggregation servers can use pseudorandomness-based sampling. The randomization for the sampling can be determined from a common seed that is collaboratively determined by the two servers (e.g., using a Diffie-Hellman key exchange). The result of the exchange is a seed for the same pseudorandom generator (e.g., one based on the Advanced Encryption Standard (AES) algorithm). This ensures that the same subset is sampled in both aggregation servers and that the same results will be computed, since once the pseudorandomness is the same, the rest of the process is deterministic.
[0109] After the joining and filtering, both aggregation servers 180-A and 180-B have the same data, e.g., a table or list of tuples that each include {count, ImpressionData.sub.3′, ConversionData.sub.3′}. The count is the number of conversions for impressions having the same impression data represented by ImpressionData.sub.3′.
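The replicated subsampling Wang describes, where a collaboratively derived seed guarantees that both aggregation servers draw the identical subset, can be sketched as follows. The Diffie-Hellman seed exchange and the AES-based generator are abstracted into a shared integer seeding Python's default pseudorandom generator; this is an illustration, not the reference's mechanism.

```python
import random

def subsample(rows: list, beta: float, shared_seed: int) -> list:
    # Pseudorandomness-based sampling: with the same seed and the same
    # input order, every server keeps exactly the same rows, each row
    # being retained with probability beta.
    rng = random.Random(shared_seed)
    return [row for row in rows if rng.random() < beta]
```

Because both servers call `subsample` with the same seed, the sampled subsets match exactly, and the deterministic k-anonymity filtering that follows produces identical results on both.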
Marathe et al. (US 20230047092 A1):
[0049] In some embodiments, the clipped parameter updates may then have noise added by a noise injecting component 217. This noise may be calibrated according to the same global clipping threshold 204 parameter provided by the aggregation server 200 such that the noise injected is calibrated to match a privacy loss bound specified by the aggregation server in some embodiments or a privacy loss bound dictated by the local machine learning system's choice of a privacy loss (upper) bound. This privacy loss bound may enforce differential privacy guarantees for the client's local dataset without coordination of the aggregation server 200.
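The clip-then-noise step Marathe describes can be sketched minimally as below, assuming L2-norm clipping and Gaussian noise calibrated to the same global threshold; both are standard choices in differentially private federated learning but are not confirmed by the excerpt.

```python
import math
import random

def clip_and_noise(update: list, clip_threshold: float,
                   noise_multiplier: float, rng: random.Random) -> list:
    # Scale the parameter update so its L2 norm is at most clip_threshold,
    # bounding any one client's influence on the aggregate, then add
    # Gaussian noise whose standard deviation is calibrated to that same
    # threshold.
    norm = math.sqrt(sum(x * x for x in update))
    factor = min(1.0, clip_threshold / norm) if norm > 0 else 1.0
    clipped = [x * factor for x in update]
    sigma = noise_multiplier * clip_threshold
    return [x + rng.gauss(0.0, sigma) for x in clipped]
```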
Ragothaman et al. (US 20220247818 A1):
[0094] The database retrieves the result data set (or query result set) that matches the query and sends the associated identifier of the transaction and optionally the result data set back to the node 120. The identifier is sent back as a regular signed transaction which is processed by the node 120 in the same manner as it receives the transaction from the user terminal 106. The result data set may be part of the new transaction or may be stored in the off-chain controller. If there is no data in the database that matches the query or the query terminates before completion of the query, the result data set is null. The completion of the query is recorded on the distributed ledger network or blockchain after receiving the result data set; thus the completion recorded includes both when there is data and when there is no data (null) in the result data set. In other words, even if the query terminates prematurely or there is no matching data, it is still recorded in the network. [0095] In an example, the database is a federated database 160 comprising a federation server 162 and a plurality of private data storage 170 (170a, 170b, 170c . . . ). Each private data storage 170 may be a data server which performs access control, differential privacy, and connects to a separate data storage medium such as a database or object storage medium on the cloud such as AWS S3 or Google Cloud Storage. Each private data storage 170 is not connected to each other but is connected to the federation server 162. The federation server 162 may be connected to a federation server storage 164 which may be used to store the results from each private data storage 170. The federation server 162 may also be provided with a private key 166. [0101] In an example, interim results from the private data storage 170 according to the data query plan are received in the federation server 162.
The private data storage 170 may return a list of identifiers (IDs) that matches the query that was submitted to them. All the data may be sent back to the node 120 (alternatively a subset of the data or aggregated data may be sent to the node 120 instead), for example by the federation server 162. The interim results may be stored temporarily or permanently in the federation server storage 164. As an example, the data set from each private data storage 170 may be considered a set or a table. If data is not received from one or more private data storage 170 according to the query plan, then it will be marked as a failed database query. Failed queries will also be sent to the off-chain controller 128 as a signed deterministic transaction to be processed by the node 120. [0103] Depending on the nature of the query, differential privacy may be applied to produce aggregate-level differentially private information. For example, if a request was made to identify the average age of the data subjects in the private data storage 170, a cryptographic module within the private data storage 170 may be used to add a sufficient amount of noise, apply a redaction threshold, and perform data validation before releasing the differentially-private aggregated output back to the federation server 162. For example, if the true average age of the users in the intersection is 34, the differentially-private result for the aggregated average age might be 34.5 (or could even be 34.4, subject to the amount of noise added to the age attribute of the data) because mathematical noise has been added to the results. By doing so, an observer or user receiving the output cannot tell if a particular individual's information was used in the computation.
[0104] Once the federated database query has been completed by the federation server 162, a response, which is either raw data (including a subset of the raw data) which fulfils the conditions of the query or aggregated data, may be stored to a federation server storage 164 and/or sent to the node (blockchain). The federation server 162 prepares a request payload and uses its private key 166 to sign the request payload before broadcasting the signed transaction back to the node (blockchain) where it is processed as described above. Briefly, the transaction will first be sent to the interface 122, sent to a transaction pool 124, and mined into a block which will form the new state in the blockchain ledger 130. During the validation, the off-chain controller 128 identifies that an off-chain data response has been received, and it will mark the off-chain transaction (identified using a Query ID No.) as completed.
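The "average age 34 → 34.5" example above can be sketched as a redaction threshold followed by a Laplace mechanism. This is a hypothetical reconstruction: the clamping range, the choice of Laplace noise, and returning None for redacted groups are all assumptions, not details from the reference.

```python
import random

def dp_average(ages: list, epsilon: float, min_count: int,
               lo: float, hi: float, rng: random.Random):
    # Redaction threshold: refuse to release anything for groups that
    # are too small to aggregate safely.
    if len(ages) < min_count:
        return None
    # Clamp values to [lo, hi] so one record moves the average by at most
    # (hi - lo) / n, then add Laplace noise of scale sensitivity/epsilon.
    clamped = [min(max(a, lo), hi) for a in ages]
    sensitivity = (hi - lo) / len(clamped)
    b = sensitivity / epsilon
    true_avg = sum(clamped) / len(clamped)
    return true_avg + rng.expovariate(1.0 / b) - rng.expovariate(1.0 / b)
```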
Zhang et al. (US 20200034566 A1):
[0086] Consider the following problem within the aforementioned social media data outsourcing framework. After receiving a data query from the data consumer, the DSP 120 searches the entire social media database to generate a dataset D, which contains all the users satisfying the query and their outsourced texts (e.g., tweets, retweets, and replies) during the period specified in the query. Each user in D is assigned an anonymous ID to provide baseline user privacy. The data consumer may also request the social graph associated with D, in which case we assume that existing defenses are adopted to preserve link and vertex privacy such that it is infeasible to link an anonymous ID to the real user based on his/her vertex's graphical property in the social graph. DSP 120 therefore transforms the raw dataset D into a new one D′ by perturbing the user texts according to the following three requirements: (I) Completeness: each data item in D can be mapped to a unique item in D′, and vice versa. In other words, no user is added to or deleted from D to create D′. (II) Privacy Preservation: The user texts in D′ can be used to link any anonymous ID in D′ to the real user with only negligible probability, meaning that text-based user-linkage attacks can be thwarted with overwhelming probability. (III) High Utility: D′ and D should lead to comparable utility at the data consumer on common data mining tasks such as statistical aggregation, clustering, and classification. [0087] Given the preceding objectives and functionality, a novel technique to achieve differentially private social media data outsourcing with the aforementioned design functionality is provided, utilizing text-indistinguishability as a foundational technique.
[0052] At block 220, processing logic injects controlled noise into the user-keyword matrix until a specified differential privacy threshold is attained or surpassed, including at least anonymizing all user IDs represented within the social media data received as input while preserving both link privacy and vertex privacy.
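The noise-injection step at block 220 can be sketched as per-cell Laplace perturbation of the user-keyword matrix. This is an illustrative simplification: the reference's text-indistinguishability mechanism, its per-row budget accounting, and its stopping threshold are not reproduced here, and the sensitivity parameter is an assumption.

```python
import random

def perturb_matrix(matrix: list, epsilon: float, sensitivity: float,
                   rng: random.Random) -> list:
    # Add independent Laplace noise of scale sensitivity/epsilon to every
    # cell of the user-keyword matrix, so the released counts no longer
    # reveal any single user's exact keyword usage.
    b = sensitivity / epsilon
    noise = lambda: rng.expovariate(1.0 / b) - rng.expovariate(1.0 / b)
    return [[cell + noise() for cell in row] for row in matrix]
```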
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ABDERRAHMEN H CHOUAT whose telephone number is (571)431-0695. The examiner can normally be reached on Mon-Fri from 9AM to 5PM PST.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Christopher Parry, can be reached at telephone number 571-272-8328. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from Patent Center. Status information for unpublished applications is available through Patent Center to authorized users only. Should you have questions about access to the USPTO patent electronic filing system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).
Examiner interviews are available via a variety of formats. See MPEP § 713.01. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) Form at https://www.uspto.gov/InterviewPractice.
Abderrahmen Chouat
Examiner
Art Unit 2451
/Chris Parry/Supervisory Patent Examiner, Art Unit 2451