DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Amendment
The amendment filed 25 September 2025 has been entered. Applicant amended claims 14-20. Claims 1-20 remain pending.
Applicant’s amendment to the claims overcomes the 35 USC 112(b) rejection of 19 May 2025. Therefore, the 35 USC 112(b) rejection of 19 May 2025 has been withdrawn.
Response to Arguments
Regarding the 35 USC 112(b) rejection:
Applicant’s arguments, filed 25 September 2025, with respect to 35 USC 112(b) rejection of 19 May 2025 have been fully considered and are persuasive. The 35 USC 112(b) rejection of 19 May 2025 has been withdrawn.
Regarding the 35 USC 103 rejection:
Applicant's arguments filed 25 September 2025 have been fully considered but they are not persuasive.
Applicant’s argument 1:
Applicant respectfully submits that Hodgman, Butler, Hoa, and Bezzi, either alone or in combination, fail to teach or suggest at least the following features as recited in independent claim 1. In particular, independent claim 1 recites "causing, by the classifier component, information that classifies the at least one data field as a sensitive data field to be stored in a data catalog without sending content of any data field in the first set of data fields to the data catalog...".
Applicant’s support:
Hodgman, at best, describes that the classifiers can include asset classifier, user classifier, and threat classifier, among other such classifiers. The asset classifier is trained to analyze asset data to classify an asset into a physical classification or a role classification, whereas a role classification can include software development, medical services, or finance. However, Hodgman fails to teach or suggest storing classified data field in a data catalog without sending content of any data field to the data catalog.
In contrast, the present application expressly distinguishes between a "data field" (i.e., the category of data) and the "content" or "data item" (i.e., the actual value stored in that field). In particular, paragraph [0026] of the as-filed specification recites "Each datastore 14 may comprise a plurality of records that contain a same set of data fields. The term data field refers to a portion of a record in a datastore. The term "content" or "data item", as used herein, refers to the actual data that is stored in a data field. For example, a Name data field of a first record in the datastore 14-1 may store the name "Bob Johnson". "Bob Johnson" is the content of the Name data field for the first record, and may also be referred to as the data item of the Name data field for the first record. The Name data field of a second record in the datastore 14-1 may store the name "John Smith". Thus, the data fields are the same in each record of the datastore 14-1, but the data items (i.e., content) of the data fields may differ from record to record. Solely as an example, the datastore 14-1 may be a "customer" datastore and may include a set of data fields such as a Name data field, an address data field, a city data field, a state data field, and a zip code data field. The datastores 14 may number in the hundreds or thousands."
Thus, Hodgman fails to teach or suggest storing only categories of the content in a data catalog without storing the actual content.
Examiner’s remarks:
Paragraph 26 of Applicant’s specification recites “The term data field refers to a portion of a record in a datastore”. Furthermore, the examiner interprets the limitation to mean that the classifier component causes “information that classifies the at least one data field (a portion of a record in a datastore) as a sensitive data field” to be stored, but the classifier component may not necessarily directly send “content…”. “Causing” and “to be stored” do not necessarily recite a positive storage of the data field by the classifier component itself without sending content of any data field in the first set of data fields…, but rather a potential future storage of the data by another component or application. Therefore, applying the broadest reasonable interpretation in light of the specification, the examiner maintains the rejection, providing the interpretation as best understood. Note: The examiner has also corrected a typographical error in the Office action in which the examiner duplicated “causing information that classifies the at least one data field as a sensitive data field to be stored in a data catalog…”.
Applicant’s argument 2:
Applicant respectfully submits that Hodgman, Butler, Hoa, and Bezzi, either alone or in combination, fail to teach or suggest at least the following features as recited in independent claim 1. In particular, independent claim 1 recites "determining, by the data security component based on the data catalog, that the query requested content from a sensitive data field...".
Applicant’s support 2: see pages 13-14 of Applicant’s remarks
Examiner’s remarks:
Paragraph 54 of Hodgman discloses “Security threat data and new vulnerability definition data can include, for example, known data describing various security threats directed to users, assets, or a combination thereof, as well as data that may potentially pose a security threat to users and assets….”. Information about, or an identifier of, the user or the security threat can be a sensitive data field or “a portion of a record in a datastore” of a data catalog. Therefore, applying the broadest reasonable interpretation in light of the specification, the examiner maintains the rejection as pointed out, providing the interpretation as best understood.
Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.
The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art. The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) is invoked.
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f):
(A) the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function;
(B) the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and
(C) the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function.
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f). The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f), is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function.
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f). The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f), is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function.
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f), except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f), except as otherwise indicated in an Office action.
This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f), because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier. Such claim limitation(s) is/are:
Claim 14 recitation of “one or more computing devices operable to receive…; analyze…; cause…; subsequently access…; determine…; and store…” is interpreted per the description disclosed in paragraph 28 of Applicant’s specification.
Claim 17 recitation of “the one or more computing devices are further operable to… obtain…; parse…; remove…; send…” is interpreted per the description disclosed in paragraph 28 of Applicant’s specification.
Claim 18 recitation of “one or more computing devices to receive…; analyze…; cause…; subsequently access…; determine…; and store…” is interpreted per the description disclosed in paragraph 28 of Applicant’s specification.
Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) applicant may: (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f).
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claim(s) 1-5, 14-16, and 18-20 is/are rejected under 35 U.S.C. 103 as being unpatentable over Hodgman et al US 20200053115 (hereinafter Hodgman), in view of Butler et al US 20200380212 (hereinafter Butler), in further view of Hoa US 20200293681 (hereinafter Hoa), and in further view of Bezzi US 20150007249 (hereinafter Bezzi).
As to claim 1, Hodgman teaches a method (paragraph 1 discloses the disclosure pertains to method of analyzing data), comprising:
receiving, by a classifier component executing on one or more processor devices (paragraph 55 discloses a classifier may execute any suitable machine learning procedures, rule-based classification techniques, heuristic techniques, or some combination thereof), an instruction to determine whether any [data] in a first datastore are sensitive [data type] in which sensitive data is stored (paragraph 23 discloses the instructions of the non-transitory computer readable storage medium, when executed by the at least one processor, further enable the computing system to provide an asset classifier, a user classifier, and a threat classifier, wherein the classifiers classify/determine different categories for the data, such as classifying an asset as a role asset and classifying a user as being associated with one of an employee type, a group type, or a role type. Role asset, employee type, and role type are sensitive data types. As shown in Figures 2-3 and paragraphs 49-50, the classifier 204 within the threat analysis system receives the data from a data source 104 via interface 216. Paragraph 6 reveals at least one data store in a service provider environment maintains at least three data sets from a plurality of data sources, each data set including information for one of assets, users, or security threats. Paragraph 7 reveals that the asset data set includes first identification information identifying individual devices on a network, the user data set includes second identifying information identifying user accounts associated with the individual devices, and the threat data set includes third identification information identifying threats to one of a device or a user account), the first datastore comprising a first data structure comprising a plurality of records, (Figure 3 and paragraph 54 reveal the threat analysis system receives data from a number of data sources such as data warehouses. 
Paragraph 6 reveals at least one data store in a service provider environment maintain at least three data sets from a plurality of data sources/records, each data set including information for one of assets, users, or security threats. Data set/data warehouses are a type of data structure);
analyzing, by the classifier component, the first set of [data] and determining that at least one [data] is a sensitive [data type] (paragraph 55 discloses the classifier is trained to analyze the data and classify the data into role classification (medical services, finances), employee type classification because the analyzed asset data can include, for example, information that identifies an electronic device, service, or other resource of a provider, and user data include data from network logs, organization chart information, employment records. See Figure 4, step 404. Role classification of medical services and finances and employee type are sensitive data types);
causing, by the classifier component, information that classifies the at least one data field as a sensitive [data type] to be stored in a data catalog without sending content of any data field in the first set of data fields to the data catalog (paragraph 55 reveals the dataset is analyzed by the classifier to augment the data into classification types. The classifier is trained to analyze the data and classify the data into role classification (medical services, finances), employee type classification because the analyzed asset data can include, for example, information that identifies an electronic device, service, or other resource of a provider, and the user data include data from network logs, organization chart information, employment records. Paragraphs 52 and 56 reveal the augment data by the classifier is stored in various data catalogs. Augmenting data to the data catalog involves enriching dataset via classification into classified data type/metadata and inputting the classified data type/metadata into a catalog. See Figure 4, step 404);
subsequently accessing, by a data security component executing on the one or more processor devices (paragraph 52 discloses the query is received from a query source and directed to a query component/data security component. Paragraph 21 discloses a non-transitory computer readable storage medium stores instructions that, when executed by at least one processor of a computing system, causes the computing system to receive a query associated with a subject, the subject being at least one of an asset, a user, or a security threat), a query [associated with] the first datastore, the query including a data field name that identifies the at least one data field (paragraph 62 discloses subsequent to step 404 and 406 of Figure 4, a query associated with a subject/data field is received. The query can be automated or manual. The subject includes information such as an identifier or other data associated with a particular data type (asset, user, or security threat data). Paragraph 52 discloses the query is received from a query source and directed to a query component/data security component. The query is associated with subject of a data source);
determining, by the data security component based on the data catalog, that the query requested content (paragraphs 52 and 62 disclose the query is analyzed by the query component to determine a subject associated with the query or at least identify a type of query. The subject can include information such as an identifier or other data associated with a particular asset, user. Mapping information, such as a lookup table, can be used to tag or otherwise identify at least one of an asset, a user, or security threat associated with the subject. Query component/data security component can direct the query to an appropriate correlator component based on the subject of the query. Mapping information, such as a lookup table, can be used to tag or otherwise identify at least one of an asset, a user, or security threat associated with the subject. The mapping information can be used to determine insights between the catalog information based on at least one of the asset, the user, or the security threat associated with the subject).
While Hodgman teaches classifying the data in a data source, Hodgman does not teach that the data pertains to data field(s), and thus does not teach receiving instructions to determine whether any data fields in a first datastore are sensitive data fields; each record comprising a first set of data fields; analyzing the first set of data fields and determining that at least one data field is a sensitive data field; causing information that classifies the at least one data field as a sensitive data field to be stored in a data catalog; a query made to a first datastore; determining that the query requested content from a sensitive data field; and storing, by the data security component, information that the query requested the content from the sensitive data field.
Butler teaches receiving instructions to determine whether any data fields in a first datastore are sensitive data fields (paragraphs 34-35 reveal an execution system configured to profile source data received from data sources, classify the source data, and associate portions of the source data with labels representing the semantic meaning of those portions of the source data. A portion of the source data can include a field in the source data. Paragraph 44 reveals the label can indicate whether a particular field includes sensitive data such as PII); each record comprising a first set of data fields (paragraphs 5 and 34-35 disclose a portion of the source data can include data fields); analyzing the first set of data fields and determining that at least one data field is a sensitive data field (paragraphs 36-40 disclose classifying each field as having a data type and determining a label for the data field. Paragraph 44 reveals the label can indicate whether a particular field includes sensitive data such as PII); causing information that classifies the at least one data field as a sensitive data field to be stored in a data catalog (paragraph 39 discloses that a load data module sends the classified labels/label index to a reference database/catalog).
It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the data from a first data source in Hodgman’s teachings of analyzing the data and classifying the data with Butler’s teachings of classifying and labeling data fields from a data source such that applications that can use the generated labels of data sets can include data quality enforcement, personal data anonymization, data masking, personally identifiable information (PII) reports, test data management, data set annotation, and so forth. Furthermore, such modification can allow the system administrator to know and understand what data is in the data set stored on the system, such as for regulatory reasons (paragraph 6 of Butler).
The combination of Hodgman in view of Butler does not teach a query made to a first datastore; determining that the query requested content from a sensitive data field, and storing, by the data security component, information that the query requested the content from the sensitive data field.
Hoa teaches accessing, by a data security component, a query made to a first datastore and determining that the query requested content from a sensitive data field (paragraph 66 discloses a security engine/data security component receives/accesses a request to access data in a personnel database. The security engine determines that the requested data is stored in one or more database columns (data fields) of the personnel database corresponding to sensitive information).
It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to further modify Hodgman’s teachings of analyzing the data, classifying the data, and accessing a query, as modified by Butler’s teachings of classifying and labeling data fields from a data source, with Hoa’s teachings of determining whether a query requested content from a sensitive data field, in order to efficiently restrict and track the queried access of sensitive data in the database without detrimentally impacting the database or the security of the stored data and to improve the auditing of access to such sensitive information (paragraph 2 of Hoa).
The combination of Hodgman in view of Butler and Hoa does not teach storing, by the data security component, information that the query requested the content from the sensitive data field.
Bezzi teaches storing, by the data security component, information that the query requested the content from the sensitive data field (paragraphs 31-32 disclose a processor/data security component stores results of the query in a temporary data store, where one of the columns of the results of the query may include data associated with a sensitive identifier such as a social security number. Paragraphs 16 and 22-24 disclose the data stored in the database tables and the columns of the table may be classified as identifiers and sensitive attributes and are associated with a privacy risk).
It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to further modify Hodgman’s teachings of analyzing the data, classifying the data, and accessing a query, as modified by Butler’s teachings of classifying and labeling data fields from a data source and Hoa’s teachings of determining whether a query requested content from a sensitive data field, to include storing the query result as taught by Bezzi, so that users are prevented from retrieving the data from the databases as soon as the databases provide the information. By storing the resultant query information first, the system can further secure the query result data for anonymization as needed according to specific needs/authorizations of the requestor to prevent data leakage (paragraphs 3 and 5 of Bezzi).
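For illustration only, the claim 1 sequence discussed above (classify data fields, record only the classification in a data catalog, then check a later query against that catalog and store a record of the access) can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from the application or from any cited reference: the names `DataCatalog`, `classify_fields`, and `audit_query`, the social-security-number pattern, and the token-based query check are all hypothetical.

```python
import re

# Hypothetical detector: a US social security number pattern (assumption).
SSN_RE = re.compile(r"\d{3}-\d{2}-\d{4}")


class DataCatalog:
    """Holds only (datastore, field name) pairs, never field content."""

    def __init__(self):
        self.sensitive = set()

    def mark_sensitive(self, datastore, field_name):
        self.sensitive.add((datastore, field_name))


def classify_fields(datastore, records, catalog):
    """Flag a field as sensitive when every record's value matches the pattern."""
    field_names = list(records[0].keys()) if records else []
    for name in field_names:
        if all(SSN_RE.fullmatch(str(r[name])) for r in records):
            # Only the field name is sent to the catalog, not the values.
            catalog.mark_sensitive(datastore, name)


def audit_query(datastore, query, catalog, audit_log):
    """Record which sensitive fields (per the catalog) a query names."""
    tokens = set(re.findall(r"[A-Za-z_]\w*", query))
    hits = sorted(
        name for (ds, name) in catalog.sensitive
        if ds == datastore and name in tokens
    )
    if hits:
        audit_log.append({"datastore": datastore, "query": query, "fields": hits})
    return hits


records = [
    {"Name": "Bob Johnson", "ssn": "123-45-6789"},
    {"Name": "John Smith", "ssn": "987-65-4321"},
]
catalog, audit_log = DataCatalog(), []
classify_fields("customer", records, catalog)
audit_query("customer", "SELECT ssn FROM customer", catalog, audit_log)
```

In this sketch the catalog receives the field name `ssn` but none of the stored values, and the query check consults only the catalog rather than the datastore.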
As to claim 2, the combination of Hodgman in view of Butler and Bezzi teaches wherein analyzing, by the classifier component, the first set of data fields and determining that the at least one data field is a sensitive data field (see claim 1 mapping above) comprises: accessing, by the classifier component, a subset of records of the plurality of records (Butler: paragraphs 51 and 58 disclose the classification module access the fields of the data source and generate profile data from the datasets/records. The profile data module can discover fields by identifying rows of tables in the source data, finding field names, references to fields, or using any similar process. The profile data module determines statistical attribute(s) of the data fields and generates profile data including those statistical attributes. The profile data identifies patterns in the source data. More specifically, the profile data includes statistics about the values of data fields of tables of the source data. For example, the profile data can include information specifying whether the data values of a data field include numerical data, character strings, etc. For example, the statistics about the data values can include a maximum value, a minimum value, a standard deviation, a mean, and so forth of the values that are included in each of the data fields (if the data are numerical). In some implementations, the statistics about the data can include how many digits or characters are in each entry of the data values. The profile data is the subset of records); determining that content stored in the at least one data field in each record in the subset of records comprises sensitive data (Butler: paragraphs 58-59 disclose the classification module classify the data fields using the profile data. Paragraphs 36-40 disclose classifying each field as having a data type and determining a label for the data field. 
Paragraph 44 reveals the label can indicate whether a particular field includes sensitive data such as PII); and in response to determining that the content stored in the at least one data field in each record in the subset of records comprises sensitive data, determining that the at least one data field is a sensitive data field (Butler: paragraphs 58-59 disclose the classification module classifies the data fields using the profile data. Paragraphs 36-40 disclose classifying each field as having a data type and determining a label for the data field. Paragraph 44 reveals the label can indicate whether a particular field includes sensitive data such as PII). Motivation similar to the motivation presented in claim 1.
As to claim 3, the combination of Hodgman in view of Butler and Bezzi teaches wherein determining that the content stored in the at least one data field in each record in the subset of records (see claim 1 and claim 2 mapping above) comprises sensitive data further comprises: processing the content with at least one regular expression (Butler: paragraphs 35 and 51 disclose for the profiling of the data, a profile data module identifies patterns in the source data. More specifically, the profile data includes statistics about the values of data fields of tables of the source data. For example, the profile data can include information specifying whether the data values of a data field include numerical data, character strings, etc. For example, the statistics about the data values can include a maximum value, a minimum value, a standard deviation, a mean, and so forth of the values that are included in each of the data fields (if the data are numerical). In some implementations, the statistics about the data can include how many digits or characters are in each entry of the data values. For example, the data profile can indicate that each data value of a data field includes seven (or ten) numbers, which may provide a contextual clue indicating that the data field includes telephone numbers); and determining, based on processing the content with the at least one regular expression, that the at least one data item is a sensitive data item (Butler: paragraphs 36 and 44 disclose each field is classified based on the profile data and is labeled. A label index provides a quick reference for the downstream applications to determine the meaning of the data values of the dataset without the downstream application having to analyze the dataset. For example, an application need only refer to the label index to determine the semantic meaning of a field. The label index can indicate whether a particular field includes personally identifying information (PII)). 
Motivation similar to the motivation presented in claim 1.
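For illustration only, the kind of check described for claim 3 (processing field content with a regular expression, using the seven- or ten-digit telephone number clue mentioned above) might look like the following. This is a hypothetical sketch: the pattern, the function name `field_matches_pattern`, and the all-values rule are assumptions and are not drawn from Butler or from the application.

```python
import re

# Assumed pattern: a value made of exactly seven or exactly ten digits.
PHONE_RE = re.compile(r"(?:\d{7}|\d{10})")


def field_matches_pattern(values, pattern=PHONE_RE):
    """Return True when the field is non-empty and every value fully matches."""
    return bool(values) and all(pattern.fullmatch(v) is not None for v in values)


phone_like = field_matches_pattern(["5551234", "2125551234"])
not_phone_like = field_matches_pattern(["5551234", "Bob Johnson"])
```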
As to claim 4, the combination of Hodgman in view of Butler and Bezzi teaches wherein analyzing, by the classifier component, the first set of data fields and determining that the at least one data field is a sensitive data field (see claim 1 mapping above) comprises: accessing a datastore schema that identifies data field names that correspond respectively to the data fields in the first set of data fields (Butler: paragraphs 58-62 disclose for each field, the classification module is configured to look up the label index including existing labels for discovered fields of the source data from a reference database/datastore schema. A data dictionary database is further used if there is not a match with the existing labels from the references database); comparing the data field names to predetermined words; and based on comparing the data field names to the predetermined words (Butler: paragraph 65 reveals a testing module performs classification tests on the field names to determine how to label the data field. Examples are shown in paragraphs 65-69, which disclose using the data content of the fields to determine a data type of the field. Identify a data type of a field involves determining that the data are numerical. The profile data also indicates that each entry in the data field is 13-18 characters long. This may indicate to the testing module that the data field may be a credit card number data field. To confirm this, one or more pattern tests can be executed by the testing module against the data of the suspect data field. For example, the first 4-6 digits for each entry can be checked against a table of issuer codes. The last number can include a check digit defined by a Luhn test. If a threshold percentage of the entries for the data field satisfy each of these patterns, the testing module can conclude that the field holds credit card numbers, and associate the field name with the appropriate label and probability. 
For the pattern matching logic, both the data itself of a given field and the patterns of the data in the field (e.g., identified in the profile data) can be used to discern which pattern tests to run and what labels to apply to the given data field), determining that the at least one data field is a sensitive data field (Butler: paragraphs 65-69 disclose using the data content of the fields to determine a data type of the field; identifying a data type of a field involves determining that the data are numerical. The profile data also indicates that each entry in the data field is 13-18 characters long, which indicates to the testing module that the data field may be a credit card number data field/sensitive data field. Paragraphs 36 and 44 further disclose each field is classified based on the profile data and is labeled. A label index provides a quick reference for the downstream applications to determine the meaning of the data values of the dataset without the downstream application having to analyze the dataset. For example, an application need only refer to the label index to determine the semantic meaning of a field. The label index can indicate whether a particular field includes personally identifying information (PII)). Motivation similar to the motivation presented in claim 1.
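For illustration only, the credit card pattern test summarized above (entries that are numerical, 13-18 characters long, ending in a Luhn check digit, with a threshold percentage of entries passing) can be sketched as follows. This is a hypothetical sketch and not Butler's implementation: the function names and the 0.9 threshold are assumptions, and the issuer-code lookup mentioned above is omitted.

```python
def luhn_valid(number: str) -> bool:
    """Standard Luhn check: double every second digit from the right."""
    if not number.isdigit():
        return False
    total = 0
    for i, ch in enumerate(reversed(number)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def looks_like_card_field(values, threshold=0.9):
    """True when at least `threshold` of the entries pass length and Luhn tests."""
    if not values:
        return False
    hits = sum(1 for v in values if 13 <= len(v) <= 18 and luhn_valid(v))
    return hits / len(values) >= threshold


card_field = looks_like_card_field(["4111111111111111", "5500000000000004"])
other_field = looks_like_card_field(["4111111111111111", "1234567890123456"])
```

Here `4111111111111111` and `5500000000000004` are widely used test numbers that satisfy the Luhn check, while `1234567890123456` does not, so only the first field would be flagged.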
As to claim 5, the combination of Hodgman in view of Butler and Bezzi teaches wherein the classifier component (Hodgman: Figure 2, reference number 204 and paragraphs 49-50) executes in a restricted computing environment requiring authorization to access the first datastore (Hodgman: paragraph 49 reveals the classifier component is in the threat analysis system component 202. Paragraph 46 reveals the threat analysis system 202 receives user authentication data that is associated with an access policy that identifies access rights of a user, including access to one or more assets), and wherein the data security component executes in an environment external to the restricted computing environment and has no access to the first datastore (Hodgman: Figure 2, reference 218 “query source”/data security component is external to the threat analysis system 202 that contains the classifier component 204. The query source includes authorized users of a service provider and does not have direct access to the data source 104).
As to claim 14, Hodgman teaches a computer system (Figure 6 and paragraph 64 disclose basic components of a computing device in accordance with the disclosure; paragraph 1 discloses the disclosure pertains to system and method of analyzing data) comprising: one or more computing devices operable to (paragraph 64 discloses the computing device includes at least one central processor for executing instructions that can be stored in at least one memory device or element. The instructions, when executed by the processor, can enable the processor to implement the method):
Receive (paragraph 55 discloses a classifier may execute any suitable machine learning procedures, rule-based classification techniques, heuristic techniques, or some combination thereof), an instruction to determine whether any [data] in a first datastore are sensitive [data type] in which sensitive data is stored (paragraph 23 discloses the instructions of the non-transitory computer readable storage medium, when executed by the at least one processor, further enable the computing system to provide an asset classifier, a user classifier, and a threat classifier, wherein the classifiers classify/determine different categories for the data, such as an asset as role asset, classify a user being associated with one of an employee type, a group type, or a role type. Role asset, employee type, role type are sensitive data types. As shown in Figures 2-3 and paragraphs 49-50, the classifier 204 within the threat analysis system receives the data from a data source 104 via interface 216. Paragraph 6 reveals at least one data store in a service provider environment maintains at least three data sets from a plurality of data sources, each data set including information for one of assets, users, or security threats. Paragraph 7 reveals wherein the asset data set includes first identification information identifying individual devices on a network, the user data set includes second identifying information identifying user accounts associated with the individual devices, and the threat data set includes third identification information identifying threats to one of a device or a user account), the first datastore comprising a first data structure comprising a plurality of records (Figure 3 and paragraph 54 reveal the threat analysis system receives data from a number of data sources such as data warehouses. Paragraph 6 reveals at least one data store in a service provider environment maintains at least three data sets from a plurality of data sources/records, each data set including information for one of assets, users, or security threats. Data sets/data warehouses are a type of data structure);
Analyze (paragraph 55 discloses the classifier is trained to analyze the data and classify the data into role classification (medical services, finances), employee type classification because the analyzed asset data can include, for example, information that identifies an electronic device, service, or other resource of a provider, and user data include data from network logs, organization chart information, employment records. See Figure 4, step 404. Role classification of medical services and finances and employee type are sensitive data types);
cause information that classifies the at least one data field as a sensitive [data type] to be stored in a data catalog without sending content of any data field in the first set of data fields to the data catalog (paragraph 55 reveals the dataset is analyzed by the classifier to augment the data into classification types. The classifier is trained to analyze the data and classify the data into role classification (medical services, finances), employee type classification because the analyzed asset data can include, for example, information that identifies an electronic device, service, or other resource of a provider, and the user data include data from network logs, organization chart information, employment records. Paragraphs 52 and 56 reveal the augment data by the classifier is stored in various data catalogs. Augmenting data to the data catalog involves enriching dataset via classification into classified data type/metadata and inputting the classified data type/metadata into a catalog. See Figure 4, step 404);
subsequently access (paragraph 52 discloses the query is received from a query source and directed to a query component/data security component. Paragraph 21 discloses a non-transitory computer readable storage medium stores instructions that, when executed by at least one processor of a computing system, causes the computing system to receive a query associated with a subject, the subject being at least one of an asset, a user, or a security threat), a query [associated with] the first datastore, the query including a data field name that identifies the at least one data field (paragraph 62 discloses subsequent to step 404 and 406 of Figure 4, a query associated with a subject/data field is received. The query can be automated or manual. The subject includes information such as an identifier or other data associated with a particular data type (asset, user, or security threat data). Paragraph 52 discloses the query is received from a query source and directed to a query component/data security component. The query is associated with subject of a data source);
determine that the query requested content (paragraphs 52 and 62 disclose the query is analyzed by the query component to determine a subject associated with the query or at least identify a type of query. The subject can include information such as an identifier or other data associated with a particular asset, user. Mapping information, such as a lookup table, can be used to tag or otherwise identify at least one of an asset, a user, or security threat associated with the subject. Query component/data security component can direct the query to an appropriate correlator component based on the subject of the query. Mapping information, such as a lookup table, can be used to tag or otherwise identify at least one of an asset, a user, or security threat associated with the subject. The mapping information can be used to determine insights between the catalog information based on at least one of the asset, the user, or the security threat associated with the subject).
While Hodgman teaches classifying the data in a data source, Hodgman does not teach that the data pertains to data field(s), and thus does not teach receiving instructions to determine whether any data fields in a first datastore are sensitive data fields; each record comprising a first set of data fields; analyzing the first set of data fields and determining that at least one data field is a sensitive data field; causing information that classifies the at least one data field as a sensitive data field to be stored in a data catalog; a query made to a first datastore; determining that the query requested content from a sensitive data field, and storing information that the query requested the content from the sensitive data field.
Butler teaches receiving instructions to determine whether any data fields in a first datastore are sensitive data fields (paragraphs 34-35 reveal an execution system configured to profile source data received from data sources and classify the source data and associate portions of the source data with labels representing the semantic meaning of those portions of the source data. A portion of the source data can include a field in the source data. Paragraph 44 reveals the label can indicate whether a particular field includes sensitive data such as PII); each record comprising a first set of data fields (paragraphs 5 and 34-35 disclose a portion of the source data can include data fields); analyzing the first set of data fields and determining that at least one data field is a sensitive data field (paragraphs 36-40 disclose classifying each field as having a data type and determining a label for the data field. Paragraph 44 reveals the label can indicate whether a particular field includes sensitive data such as PII); causing information that classifies the at least one data field as a sensitive data field to be stored in a data catalog (paragraph 39 discloses load data module sends the classified labels/ label index to a reference database/catalog).
It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the data from a first data source in Hodgman’s teachings of analyzing the data and classifying the data with Butler’s teachings of classifying and labeling data fields from a data source such that applications that can use the generated labels of data sets can include data quality enforcement, personal data anonymization, data masking, personally identifiable information (PII) reports, test data management, data set annotation, and so forth. Furthermore, such modification can allow the system administrator to know and understand what data is in the data set stored on the system, such as for regulatory reasons (paragraph 6 of Butler).
The combination of Hodgman in view of Butler does not teach a query made to a first datastore; determining that the query requested content from a sensitive data field, and storing information that the query requested the content from the sensitive data field.
Hoa teaches accessing, by a data security component, a query made to a first datastore and determining that the query requested content from a sensitive data field (paragraph 66 discloses a security engine/data security component receives/accesses a request to access data in a personnel database. The security engine determines that the requested data is stored in one or more database columns (data fields) of the personnel database corresponding to sensitive information).
It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to further modify the queries of Hodgman with the modification of the data from a first data source in Hodgman’s teachings of analyzing the data, classifying the data, and accessing a query in view of Butler’s teachings of classifying and labeling data fields from a data source with Hoa’s teachings of determining whether query requested content from a sensitive data field to efficiently restrict and track the queried access of sensitive data in the database without detrimentally impacting the database or the security of the stored data and to improve the auditing of access to such sensitive information (paragraph 2 of Hoa).
The combination of Hodgman in view of Butler and Hoa does not teach storing information that the query requested the content from the sensitive data field.
Bezzi teaches storing information that the query requested the content from the sensitive data field (paragraphs 31-32 disclose a processor/data security component stores results of the query in a temporary data store before anonymization of the query results, one of the columns of the results of the query may include data associated with sensitive identifier of social security number. Paragraphs 16 and 22- 24 disclose the data stored in the database tables and the columns of the table may be classified as identifiers and sensitive attributes and are associated with a privacy risk).
It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to further modify the queries of Hodgman with the modification of the data from a first data source in Hodgman’s teachings of analyzing the data, classifying the data, and accessing a query and of Butler’s teachings of classifying and labeling data fields from a data source with Hoa’s teachings of determining whether query requested content from a sensitive data field to further include storing the query result as taught by Bezzi to prevent users from retrieving the data from the databases as soon as the databases provide the information. By storing the resultant query information first, the system can further secure the data for anonymization as needed according to specific needs/authorizations of the requestor to prevent data leakage (paragraphs 3 and 5 of Bezzi).
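For illustration only, the query-side limitations of claim 14 as recited (determining, from catalog information, that a query names a sensitive data field, and storing information that the query requested that content) can be sketched as follows. This is a minimal sketch under stated assumptions and does not represent the implementation of Hodgman, Hoa, Bezzi, or the present application: the catalog layout, the datastore name `customers`, the field names, and the functions `sensitive_fields_in_query` and `record_access` are all hypothetical, and field names are assumed to be stored in lowercase.

```python
import re

# Hypothetical catalog: datastore name -> {field name: is_sensitive}.
# It holds classification information only, never field content.
CATALOG = {"customers": {"name": False, "city": False, "ssn": True}}


def sensitive_fields_in_query(query, datastore, catalog):
    """Return the sensitive field names of `datastore` that appear in the query text."""
    tokens = set(re.findall(r"[A-Za-z_][A-Za-z0-9_]*", query.lower()))
    fields = catalog.get(datastore, {})
    return sorted(name for name, sensitive in fields.items() if sensitive and name in tokens)


def record_access(query, datastore, catalog, audit_log):
    """Append an audit entry when the query names at least one sensitive field."""
    hits = sensitive_fields_in_query(query, datastore, catalog)
    if hits:
        audit_log.append({"datastore": datastore, "fields": hits, "query": query})
    return hits
```

In this sketch the check consults only the catalog and the query text, so the component performing it needs no access to the datastore itself.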
As to claim 15, the combination of Hodgman in view of Butler and Bezzi teaches wherein analyzing the first set of data fields and determining that the at least one data field is a sensitive data field (see claim 14 mapping above) comprises: access a subset of records of the plurality of records (Butler: paragraphs 51 and 58 disclose the classification module accesses the fields of the data source and generates profile data from the datasets/records. The profile data module can discover fields by identifying rows of tables in the source data, finding field names, references to fields, or using any similar process. The profile data module determines statistical attribute(s) of the data fields and generates profile data including those statistical attributes. The profile data identifies patterns in the source data. More specifically, the profile data includes statistics about the values of data fields of tables of the source data. For example, the profile data can include information specifying whether the data values of a data field include numerical data, character strings, etc. For example, the statistics about the data values can include a maximum value, a minimum value, a standard deviation, a mean, and so forth of the values that are included in each of the data fields (if the data are numerical). In some implementations, the statistics about the data can include how many digits or characters are in each entry of the data values. The profile data is the subset of records); determine that content stored in the at least one data field in each record in the subset of records comprises sensitive data (Butler: paragraphs 58-59 disclose the classification module classifies the data fields using the profile data. Paragraphs 36-40 disclose classifying each field as having a data type and determining a label for the data field. Paragraph 44 reveals the label can indicate whether a particular field includes sensitive data such as PII); and in response to determining that the content stored in the at least one data field in each record in the subset of records comprises sensitive data, determine that the at least one data field is a sensitive data field (Butler: paragraphs 58-59 disclose the classification module classifies the data fields using the profile data. Paragraphs 36-40 disclose classifying each field as having a data type and determining a label for the data field. Paragraph 44 reveals the label can indicate whether a particular field includes sensitive data such as PII). Motivation similar to the motivation presented in claim 14.
As to claim 16, the combination of Hodgman in view of Butler and Bezzi teaches wherein the classification of the at least one data field as sensitive data field is performed in a restricted computing environment requiring authorization to access the first datastore (Hodgman: paragraph 49 reveals the classifier component is in the threat analysis system component 202. Paragraph 46 reveals the threat analysis system 202 receives user authentication data that is associated with an access policy that identifies access rights of a user, including access to one or more assets), and wherein the determination based on the catalog is performed in an environment external to the restricted computing environment and has no access to the first datastore (Hodgman: Figure 2, reference 218 “query source”/data security component is external to the threat analysis system 202 that contains the classifier component 204. The query source includes authorized users of a service provider and does not have direct access to the data source 104).
As to claim 18, Hodgman teaches a non-transitory computer-readable storage medium that includes executable instructions operable to cause one or more computing devices to (Figure 6 and paragraph 64 disclose basic components of a computing device in accordance with the disclosure; paragraph 1 discloses the disclosure pertains to system and method of analyzing data; paragraph 64 discloses the computing device includes at least one central processor for executing instructions that can be stored in at least one memory device or element. The instructions, when executed by the processor, can enable the processor to implement the method):
Receive (paragraph 55 discloses a classifier may execute any suitable machine learning procedures, rule-based classification techniques, heuristic techniques, or some combination thereof), an instruction to determine whether any [data] in a first datastore are sensitive [data type] in which sensitive data is stored (paragraph 23 discloses the instructions of the non-transitory computer readable storage medium, when executed by the at least one processor, further enable the computing system to provide an asset classifier, a user classifier, and a threat classifier, wherein the classifiers classify/determine different categories for the data, such as an asset as role asset, classify a user being associated with one of an employee type, a group type, or a role type. Role asset, employee type, role type are sensitive data types. As shown in Figures 2-3 and paragraphs 49-50, the classifier 204 within the threat analysis system receives the data from a data source 104 via interface 216. Paragraph 6 reveals at least one data store in a service provider environment maintains at least three data sets from a plurality of data sources, each data set including information for one of assets, users, or security threats. Paragraph 7 reveals wherein the asset data set includes first identification information identifying individual devices on a network, the user data set includes second identifying information identifying user accounts associated with the individual devices, and the threat data set includes third identification information identifying threats to one of a device or a user account), the first datastore comprising a first data structure comprising a plurality of records (Figure 3 and paragraph 54 reveal the threat analysis system receives data from a number of data sources such as data warehouses. Paragraph 6 reveals at least one data store in a service provider environment maintains at least three data sets from a plurality of data sources/records, each data set including information for one of assets, users, or security threats. Data sets/data warehouses are a type of data structure);
analyze the first set of [data] and determining that at least one [data] is a sensitive [data type] (paragraph 55 discloses the classifier is trained to analyze the data and classify the data into role classification (medical services, finances), employee type classification because the analyzed asset data can include, for example, information that identifies an electronic device, service, or other resource of a provider, and user data include data from network logs, organization chart information, employment records. See Figure 4, step 404. Role classification of medical services and finances and employee type are sensitive data types);
cause information that classifies the at least one data field as a sensitive [data type] to be stored in a data catalog without sending content of any data field in the first set of data fields to the data catalog (paragraph 55 reveals the dataset is analyzed by the classifier to augment the data into classification types. The classifier is trained to analyze the data and classify the data into role classification (medical services, finances), employee type classification because the analyzed asset data can include, for example, information that identifies an electronic device, service, or other resource of a provider, and the user data include data from network logs, organization chart information, employment records. Paragraphs 52 and 56 reveal the augment data by the classifier is stored in various data catalogs. Augmenting data to the data catalog involves enriching dataset via classification into classified data type/metadata and inputting the classified data type/metadata into a catalog. See Figure 4, step 404);
subsequently access (paragraph 52 discloses the query is received from a query source and directed to a query component/data security component. Paragraph 21 discloses a non-transitory computer readable storage medium stores instructions that, when executed by at least one processor of a computing system, causes the computing system to receive a query associated with a subject, the subject being at least one of an asset, a user, or a security threat), a query [associated with] the first datastore, the query including a data field name that identifies the at least one data field (paragraph 62 discloses subsequent to step 404 and 406 of Figure 4, a query associated with a subject/data field is received. The query can be automated or manual. The subject includes information such as an identifier or other data associated with a particular data type (asset, user, or security threat data). Paragraph 52 discloses the query is received from a query source and directed to a query component/data security component. The query is associated with subject of a data source);
determine based on the data catalog, that the query requested content (paragraphs 52 and 62 disclose the query is analyzed by the query component to determine a subject associated with the query or at least identify a type of query. The subject can include information such as an identifier or other data associated with a particular asset, user. Mapping information, such as a lookup table, can be used to tag or otherwise identify at least one of an asset, a user, or security threat associated with the subject. Query component/data security component can direct the query to an appropriate correlator component based on the subject of the query. Mapping information, such as a lookup table, can be used to tag or otherwise identify at least one of an asset, a user, or security threat associated with the subject. The mapping information can be used to determine insights between the catalog information based on at least one of the asset, the user, or the security threat associated with the subject).
While Hodgman teaches classifying the data in a data source, Hodgman does not teach that the data pertains to data field(s), and thus does not teach receiving instructions to determine whether any data fields in a first datastore are sensitive data fields; each record comprising a first set of data fields; analyzing the first set of data fields and determining that at least one data field is a sensitive data field; causing information that classifies the at least one data field as a sensitive data field to be stored in a data catalog; a query made to a first datastore; determining that the query requested content from a sensitive data field, and storing information that the query requested the content from the sensitive data field.
Butler teaches receiving instructions to determine whether any data fields in a first datastore are sensitive data fields (paragraphs 34-35 reveal an execution system configured to profile source data received from data sources and classify the source data and associate portions of the source data with labels representing the semantic meaning of those portions of the source data. A portion of the source data can include a field in the source data. Paragraph 44 reveals the label can indicate whether a particular field includes sensitive data such as PII); each record comprising a first set of data fields (paragraphs 5 and 34-35 disclose a portion of the source data can include data fields); analyzing the first set of data fields and determining that at least one data field is a sensitive data field (paragraphs 36-40 disclose classifying each field as having a data type and determining a label for the data field. Paragraph 44 reveals the label can indicate whether a particular field includes sensitive data such as PII); causing information that classifies the at least one data field as a sensitive data field to be stored in a data catalog (paragraph 39 discloses load data module sends the classified labels/ label index to a reference database/catalog).
It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the data from a first data source in Hodgman’s teachings of analyzing the data and classifying the data with Butler’s teachings of classifying and labeling data fields from a data source such that applications that can use the generated labels of data sets can include data quality enforcement, personal data anonymization, data masking, personally identifiable information (PII) reports, test data management, data set annotation, and so forth. Furthermore, such modification can allow the system administrator to know and understand what data is in the data set stored on the system, such as for regulatory reasons (paragraph 6 of Butler).
The combination of Hodgman in view of Butler does not teach a query made to a first datastore; determining that the query requested content from a sensitive data field, and storing information that the query requested the content from the sensitive data field.
Hoa teaches accessing, by a data security component, a query made to a first datastore and determining that the query requested content from a sensitive data field (paragraph 66 discloses a security engine/data security component receives/accesses a request to access data in a personnel database. The security engine determines that the requested data is stored in one or more database columns (data fields) of the personnel database corresponding to sensitive information).
It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to further modify the queries of Hodgman with the modification of the data from a first data source in Hodgman’s teachings of analyzing the data, classifying the data, and accessing a query in view of Butler’s teachings of classifying and labeling data fields from a data source with Hoa’s teachings of determining whether query requested content from a sensitive data field to efficiently restrict and track the queried access of sensitive data in the database without detrimentally impacting the database or the security of the stored data and to improve the auditing of access to such sensitive information (paragraph 2 of Hoa).
The combination of Hodgman in view of Butler and Hoa does not teach storing information that the query requested the content from the sensitive data field.
Bezzi teaches storing, by the data security component, information that the query requested the content from the sensitive data field (paragraphs 31-32 disclose a processor/data security component stores results of the query in a temporary data store before anonymization of the query results, one of the columns of the results of the query may include data associated with sensitive identifier of social security number. Paragraphs 16 and 22- 24 disclose the data stored in the database tables and the columns of the table may be classified as identifiers and sensitive attributes and are associated with a privacy risk).
It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to further modify the queries of Hodgman with the modification of the data from a first data source in Hodgman’s teachings of analyzing the data, classifying the data, and accessing a query and of Butler’s teachings of classifying and labeling data fields from a data source with Hoa’s teachings of determining whether query requested content from a sensitive data field to further include storing the query result as taught by Bezzi to prevent users from retrieving the data from the databases as soon as the databases provide the information. By storing the resultant query information first, the system can further secure the data for anonymization as needed according to specific needs/authorizations of the requestor to prevent data leakage (paragraphs 3 and 5 of Bezzi).
As to claim 19, the combination of Hodgman in view of Butler and Bezzi teaches wherein analyzing the first set of data fields and determining that the at least one data field is a sensitive data field (see claim 18 mapping above) comprises: access a subset of records of the plurality of records (Butler: paragraphs 51 and 58 disclose the classification module accesses the fields of the data source and generates profile data from the datasets/records. The profile data module can discover fields by identifying rows of tables in the source data, finding field names, references to fields, or using any similar process. The profile data module determines statistical attribute(s) of the data fields and generates profile data including those statistical attributes. The profile data identifies patterns in the source data. More specifically, the profile data includes statistics about the values of data fields of tables of the source data. For example, the profile data can include information specifying whether the data values of a data field include numerical data, character strings, etc. For example, the statistics about the data values can include a maximum value, a minimum value, a standard deviation, a mean, and so forth of the values that are included in each of the data fields (if the data are numerical). In some implementations, the statistics about the data can include how many digits or characters are in each entry of the data values. The profile data is the subset of records); determine that content stored in the at least one data field in each record in the subset of records comprises sensitive data (Butler: paragraphs 58-59 disclose the classification module classifies the data fields using the profile data. Paragraphs 36-40 disclose classifying each field as having a data type and determining a label for the data field. Paragraph 44 reveals the label can indicate whether a particular field includes sensitive data such as PII); and in response to determining that the content stored in the at least one data field in each record in the subset of records comprises sensitive data, determine that the at least one data field is a sensitive data field (Butler: paragraphs 58-59 disclose the classification module classifies the data fields using the profile data. Paragraphs 36-40 disclose classifying each field as having a data type and determining a label for the data field. Paragraph 44 reveals the label can indicate whether a particular field includes sensitive data such as PII). Motivation similar to the motivation presented in claim 18.
As to claim 20, the combination of Hodgman in view of Butler and Bezzi teaches wherein the classification of the at least one data field as sensitive data field (Hodgman: Figure 2, reference number 204 and paragraphs 49-50, see claim 18 above) is performed in a restricted computing environment requiring authorization to access the first datastore (Hodgman: paragraph 49 reveals the classifier component is in the threat analysis system component 202. Paragraph 46 reveals the threat analysis system 202 receives user authentication data that is associated with an access policy that identifies access rights of a user, including access to one or more assets), and wherein determination based on the catalog is performed in an environment external to the restricted computing environment and has no access to the first datastore (Hodgman: Figure 2, reference 218 “query source”/data security component is external to the threat analysis system 202 that contains the classifier component 204. The query source includes authorized users of a service provider and does not have direct access to the data source 104).
Claim(s) 6-13 and 17 is/are rejected under 35 U.S.C. 103 as being unpatentable over Hodgman et al US 20200053115 (hereinafter Hodgman), in view of Butler et al US 20200380212 (hereinafter Butler), in further view of Hoa US 20200293681 (hereinafter Hoa), in further view of Bezzi US 20150007249 (hereinafter Bezzi), and in further view of Bhargava et al US 20180307723 (hereinafter Bhargava).
As to claim 6, the combination of Hodgman in view of Butler and Bezzi teaches all the limitations recited in claim 1 above and further teaches: prior to accessing the query, obtaining, by a sensitive data removing component executing on the one or more processor devices, the query from a query repository (Bezzi: paragraphs 28-29 disclose that prior to accessing [the result of] the query, a processor generates/obtains a query to retrieve data from a data store. Paragraph 67 and Figure 5 also disclose prior to accessing the query by the system controller and privacy risk estimation module 144/data security component, obtaining the query by the query handler module 432/sensitive data removing module from the result set explorer module/query repository. Hodgman: paragraph 50 discloses that the interface obtains the query from the query source. Paragraph 62 discloses the query can be an automated query or a manual query. An automated query can include system generated queries); parsing, by the sensitive data removing component, the query (Bezzi: paragraph 67 reveals the query handler module/sensitive data removing component parses the query). It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to further modify the queries of Hodgman with the modification of the data from a first data source in Hodgman’s teachings of analyzing the data, classifying the data, and accessing a query and of Butler’s teachings of classifying and labeling data fields from a data source with Hoa’s teachings of determining whether a query requested content from a sensitive data field to further include obtaining and parsing the query as taught by Bezzi to provide the capability of processing large volumes of data in real time and anonymizing the data as necessary in real time as a result of the parsed query (paragraph 5 of Bezzi).
The combination of Hodgman in view of Butler and Bezzi does not teach parsing the query to identify a data field name, a parameter, and a condition with respect to the data field name and the parameter that identifies records that match the query; removing, by the sensitive-data removing component, the parameter such that the query no longer includes the parameter; and sending, by the sensitive-data removing component, the query to the data security component.
Bhargava teaches parsing, by the sensitive data removing component, the query to identify a data field name, a parameter, and a condition with respect to the data field name and the parameter that identifies records that match the query (paragraphs 33-35, 73, 78, and 82 disclose that the middleware subsystem receives the request and parses the request. The request can also include fields containing a user identifier, a client identifier, a session identifier, an authorization token, a database table, or other identifier indicating which database, table, or other structure is the target of the client request, a target record count, a starting record identifier, or a mode flag/condition, and a filter/parameter. Paragraph 78 discloses the mode flag/condition is used to designate whether records should be returned or record identifiers/data field name. Paragraph 3 reveals the parameters include/identify a number of data records sought to be returned to the requesting client that match the query parameter); removing, by the sensitive-data removing component, the parameter such that the query no longer includes the parameter (paragraph 82 discloses that the request can have the PII fields parameters removed/filtered, wherein the filter indicates removal of such fields. A similar request can have different filters or none at all); and sending, by the sensitive-data removing component, the query to the data security component (paragraphs 81-82 reveal the query/request can be issued by the middleware subsystem to an authorization service/data security component).
It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to further modify the queries of Hodgman with the modification of the data from a first data source in Hodgman’s teachings of analyzing the data, classifying the data, and accessing a query and of Butler’s teachings of classifying and labeling data fields from a data source with Hoa’s teachings of determining whether a query requested content from a sensitive data field, in view of obtaining and parsing the query as taught by Bezzi, with Bhargava’s teachings of processing the query to provide an improved method for managing data retrieval when runtime authorization is required (paragraph 1 of Bhargava).
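Solely as an illustration of the claim 6 limitations of parsing a query to identify a data field name, a parameter, and a condition, and of removing the parameter before the query is sent onward, the following minimal Python sketch handles simple comparison conditions only. It is not a characterization of the disclosure of Bezzi or Bhargava. The regular expression, the function name, the `customers` table, and the use of `?` as a placeholder for the removed parameter are assumptions made for the example.

```python
import re

# Hypothetical grammar: <field> <operator> <quoted string or integer>.
CONDITION = re.compile(
    r"(?P<field>\w+)\s*(?P<op>=|<>|<=|>=|<|>)\s*(?P<param>'[^']*'|\d+)"
)


def strip_parameters(query):
    """Return (query without parameters, list of parsed conditions).

    Each parsed condition is a (field name, operator, parameter) tuple.
    The parameter is replaced with '?' so that the returned query no
    longer includes it.
    """
    found = []

    def _replace(match):
        found.append(
            (match.group("field"), match.group("op"), match.group("param").strip("'"))
        )
        return f"{match.group('field')} {match.group('op')} ?"

    return CONDITION.sub(_replace, query), found


query = "SELECT * FROM customers WHERE name = 'Bob Johnson' AND zip = 12345"
sanitized, conditions = strip_parameters(query)
print(sanitized)
print(conditions)
```

Only `sanitized` would be forwarded in this sketch; the parsed parameters remain with the component that removed them.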
As to claim 7, the combination of Hodgman in view of Butler, Hoa, Bezzi, and Bhargava teaches wherein the sensitive-data removing component executes in a restricted computing environment requiring authorization to access the query repository (Bhargava: Figure 1 and paragraphs 33-36 reveal the middleware subsystem is configured to access the one or more databases to obtain available data records or record identifiers in response to a client request, or authorized data records in response to authorization from an authorization service. Paragraph 40 discloses the middleware subsystem is hosted on distinct computer nodes that include a processor and memory. The middleware subsystem is in a restricted environment because interactions with authorization services are managed by the authorization interface module of the middleware subsystem), and wherein the data security component executes in an environment external to the restricted computing environment and has no access to the query repository (Bhargava: Figure 1 and paragraphs 33-36 reveal the authorization service/identity service is external to the restricted environment of the middleware subsystem and does not have access to the direct query sent by the client 111 and client adapter module 112, which can be the query repository/source). Motivation is similar to the motivation presented in claim 6.
As to claim 8, the combination of Hodgman in view of Butler, Hoa, Bezzi, and Bhargava teaches wherein the query repository comprises a datastore log that stores queries made to the first datastore (Bhargava: paragraph 79 discloses that the database record request can be initiated as a stored procedure call by a Java service within a middleware subsystem, which in turn calls a database query, such as a HANA query, within the stored procedure layer). Motivation is similar to the motivation presented in claim 6.
As to claim 9, the combination of Hodgman in view of Butler, Hoa, Bezzi, and Bhargava teaches wherein the query repository comprises a trace data structure in which, for each transaction made to an application comprising a plurality of microservices (Bhargava: paragraphs 34 and 46 disclose that the client adapter module/microservice of the middleware subsystem/application can be configured to receive request messages/transactions from one of the clients, which can represent requests for data records. During or following processing of a request message, the client adapter module can be configured to respond to the client request with one or more batches of authorized data records. The client adapter module can also be configured to accumulate authorized data records in an accumulator until a threshold number of records, such as a page length, has been reached. Authorized data records are not returned to the client immediately, but can be accumulated until a threshold number of authorized data records is reached, which can be more efficient for communications between the data retrieval system and clients. Paragraph 4 also discloses that the retrieval control adaptor keeps a record of the target result count), each respective microservice in a chain of microservices stores trace records that identify an upstream microservice that invoked the respective microservice and any queries made to the first datastore by the respective microservice (Bhargava: paragraph 54 discloses extra records can be stored in an accumulator in anticipation of a subsequent request, or they can be returned to the requesting client for management or optional caching at client-side). Motivation is similar to the motivation presented in claim 6.
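Solely as an illustration of the trace data structure recited in claim 9, and not as a characterization of the disclosure of Bhargava or of the as-filed specification, the following minimal Python sketch stores one trace record per microservice, each identifying the upstream microservice that invoked it and any queries it made. The class name, the helper functions, the microservice names, and the assumption of a single linear chain per transaction are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TraceRecord:
    """One trace record stored by a microservice for a transaction."""
    microservice: str
    upstream: Optional[str]  # None for the first microservice in the chain
    queries: List[str] = field(default_factory=list)


def call_chain(trace):
    """Rebuild the invocation order, assuming one linear chain without cycles."""
    by_upstream = {rec.upstream: rec.microservice for rec in trace}
    chain, current = [], None
    while current in by_upstream:
        current = by_upstream[current]
        chain.append(current)
    return chain


def queries_for_transaction(trace):
    """Collect every query recorded by any microservice in the transaction."""
    return [query for rec in trace for query in rec.queries]


trace = [
    TraceRecord("orders", upstream="gateway"),
    TraceRecord("gateway", upstream=None),
    TraceRecord("customers", upstream="orders",
                queries=["SELECT name FROM customers WHERE id = ?"]),
]
print(call_chain(trace))
print(queries_for_transaction(trace))
```

A component reading such a structure could recover both the chain of microservices and the queries made within it without contacting the datastore itself.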
As to claim 10, the combination of Hodgman in view of Butler, Hoa, Bezzi, and Bhargava further teaches obtaining, by a validating microservice of the plurality of microservices, a security token that identifies an entity that caused the query to be made to the first datastore (Bhargava: paragraphs 7 and 73 reveal the client request that is received by the middleware subsystem includes an authorization token and client identifier. The authorization request to the authorization service can include a user or client identifier, an authorization token, or other parameters or flags forwarded from the client request handled by the authorization interface module/validating microservice of the plurality of microservices depicted as reference numbers 112, 122, 114, 124, and 116 shown in Figure 1. Paragraph 82 reveals a client or user identifier can be required in environments where access permissions are different for different users of clients. Hodgman: paragraphs 54 and 62 disclose that the threat analysis service receives user authentication/token that includes a user identifier. The user identifier is associated with an access policy that identifies access rights of a user, including access to one or more assets due to an access request); and sending, by the sensitive-data removing component to the data security component, at least a portion of the security token that identifies the entity (Bhargava: paragraphs 7 and 73 reveal the client request that is received by the middleware subsystem/sensitive data removing component includes an authorization token and client identifier. The authorization request to the authorization service/security component from the middleware subsystem can include a user or client identifier, an authorization token, or other parameters or flags forwarded from the client request handled by the authorization interface module). Motivation is similar to the motivation presented in claim 6.
As to claim 11, the combination of Hodgman in view of Butler, Hoa, Bezzi, and Bhargava further teaches: obtaining, by the sensitive-data removing component, a security token that identifies an entity that caused the query to be made to the first datastore (Bhargava: paragraphs 7 and 73 reveal the client request that is received by the middleware subsystem includes an authorization token and client identifier. The authorization request to the authorization service can include a user or client identifier, an authorization token, or other parameters or flags forwarded from the client request handled by the authorization interface module/validating microservice of the plurality of microservices depicted as reference numbers 112, 122, 114, 124, and 116 shown in Figure 1. Paragraph 82 reveals a client or user identifier can be required in environments where access permissions are different for different users of clients. Hodgman: paragraphs 54 and 62 disclose that the threat analysis service receives user authentication/token that includes a user identifier. The user identifier is associated with an access policy that identifies access rights of a user, including access to one or more assets due to an access request); and sending, by the sensitive-data removing component to the data security component, at least a portion of the security token that identifies the entity (Bhargava: paragraphs 7 and 73 reveal the client request that is received by the middleware subsystem/sensitive data removing component includes an authorization token and client identifier. The authorization request to the authorization service/security component from the middleware subsystem can include a user or client identifier, an authorization token, or other parameters or flags forwarded from the client request handled by the authorization interface module). Motivation is similar to the motivation presented in claim 6.
As to claim 12, the combination of Hodgman in view of Butler, Hoa, Bezzi, and Bhargava further teaches wherein the at least the portion of the security token identifies an individual (Bhargava: paragraphs 7 and 73 reveal the client request that is received by the middleware subsystem includes an authorization token and client identifier. The authorization request to the authorization service can include a user or client identifier and an authorization token. Paragraph 82 reveals a client or user identifier can be required in environments where access permissions are different for different users of clients. Hodgman: paragraphs 54 and 62 disclose that the threat analysis service receives user authentication/token that includes a user identifier. The user identifier is associated with an access policy that identifies access rights of a user, including access to one or more assets due to an access request). Motivation is similar to the motivation presented in claim 6.
As to claim 13, the combination of Hodgman in view of Butler, Hoa, and Bezzi teaches all the limitations recited in claim 1 above and further teaches receiving a user input selection of the first datastore (Hodgman: paragraphs 6 and 48 disclose receiving a query associated with a subject from a data source. The user can input/submit the query for access to data in a data source. Hoa: paragraph 22 also discloses receiving, via the interface engine, a request by an entity to access data from a personnel database); and sending, to the classifier component, the instruction to determine whether any data fields in the first datastore should be classified as sensitive data fields in which sensitive data is stored (Butler: paragraph 44 discloses that the classification module has instructions to implement a label index to indicate whether a particular field of the dataset includes sensitive information/PII. Hoa: paragraphs 25 and 30 disclose that the security engine has instructions to determine whether requested data from the database is sensitive or non-sensitive data). Motivation is similar to the motivation presented in claim 1.
The combination of Hodgman in view of Butler, Hoa, and Bezzi does not teach, but Bhargava teaches, causing user interface imagery that identifies a plurality of datastores associated with an entity, including the first datastore, to be presented on a display device (paragraphs 22 and 58 disclose the responses to the client request/returned records can be provided one display page at a time according to the batch size. The returned records can be displayed on a user interface. Paragraph 34 discloses the client adapter module can be configured to receive request messages from one of the clients, which can represent requests for data records. During or following processing of a request message, the client adapter module can be configured to respond to the client request with one or more batches of authorized data records. The client adapter module can also be configured to accumulate authorized data records in an accumulator until a threshold number of records, such as a page length, has been reached. The page length can denote a maximum number of data records that can be accommodated on a display page of the requesting client's display device).
It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to further modify the queries of Hodgman with the modification of the data from a first data source in Hodgman’s teachings of analyzing the data, classifying the data, and accessing a query and of Butler’s teachings of classifying and labeling data fields from a data source with Hoa’s teachings of determining whether a query requested content from a sensitive data field, in view of obtaining and parsing the query as taught by Bezzi, with Bhargava’s teachings of displaying the query result to provide an improved method for managing data retrieval when runtime authorization is required (paragraph 1 of Bhargava).
As to claim 17, the combination of Hodgman in view of Butler and Bezzi teaches all the limitations recited in claim 14 above and further teaches: prior to accessing the query, obtain, by a sensitive data removing component executing on the one or more processor devices, the query from a query repository (Bezzi: paragraphs 28-29 disclose that prior to accessing [the result of] the query, a processor generates/obtains a query to retrieve data from a data store. Paragraph 67 and Figure 5 also disclose prior to accessing the query by the system controller and privacy risk estimation module 144/data security component, obtaining the query by the query handler module 432/sensitive data removing module from the result set explorer module/query repository. Hodgman: paragraph 50 discloses that the interface obtains the query from the query source. Paragraph 62 discloses the query can be an automated query or a manual query. An automated query can include system generated queries); parse the query (Bezzi: paragraph 67 reveals the query handler module/sensitive data removing component parses the query). It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to further modify the queries of Hodgman with the modification of the data from a first data source in Hodgman’s teachings of analyzing the data, classifying the data, and accessing a query and of Butler’s teachings of classifying and labeling data fields from a data source with Hoa’s teachings of determining whether a query requested content from a sensitive data field to further include obtaining and parsing the query as taught by Bezzi to provide the capability of processing large volumes of data in real time and anonymizing the data as necessary in real time as a result of the parsed query (paragraph 5 of Bezzi).
The combination of Hodgman in view of Butler and Bezzi does not teach parse the query to identify a data field name, a parameter, and a condition with respect to the data field name and the parameter that identifies records that match the query; remove the parameter such that the query no longer includes the parameter; and send the query to the data security component.
Bhargava teaches parsing, by the sensitive data removing component, the query to identify a data field name, a parameter, and a condition with respect to the data field name and the parameter that identifies records that match the query (paragraphs 33-35, 73, 78, and 82 disclose that the middleware subsystem receives the request and parses the request. The request can also include fields containing a user identifier, a client identifier, a session identifier, an authorization token, a database table, or other identifier indicating which database, table, or other structure is the target of the client request, a target record count, a starting record identifier, or a mode flag/condition, and a filter/parameter. Paragraph 78 discloses the mode flag/condition is used to designate whether records should be returned or record identifiers/data field name. Paragraph 3 reveals the parameters include/identify a number of data records sought to be returned to the requesting client that match the query parameter); removing, by the sensitive-data removing component, the parameter such that the query no longer includes the parameter (paragraph 82 discloses a request for sanitized records having PII fields removed/filtered, wherein the filter indicates removal of such fields. A similar request can have different filters or none at all); and sending, by the sensitive-data removing component, the query to the data security component (paragraphs 81-82 reveal the query/request can be issued by the middleware subsystem to an authorization service/data security component).
It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to further modify the queries of Hodgman with the modification of the data from a first data source in Hodgman’s teachings of analyzing the data, classifying the data, and accessing a query and of Butler’s teachings of classifying and labeling data fields from a data source with Hoa’s teachings of determining whether a query requested content from a sensitive data field, in view of obtaining and parsing the query as taught by Bezzi, with Bhargava’s teachings of processing the query to provide an improved method for managing data retrieval when runtime authorization is required (paragraph 1 of Bhargava).
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to FELICIA FARROW whose telephone number is (571)272-1856. The examiner can normally be reached M - F 7:30am-4:00pm (EST).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Alexander Lagor, can be reached at (571)270-5143. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/F.F/Examiner, Art Unit 2437
/ALEXANDER LAGOR/Supervisory Patent Examiner, Art Unit 2437