DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Status of Claims
The following is a non-final office action.
Claims 1-20 are currently pending and have been examined on their merits.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1: Claims 1-20 recite a system and therefore each claim falls within one of the four statutory categories.
Step 2A prong 1 (Is a judicial exception recited?):
The representative claim 1 recites receive, a dataset associated with a project; if the dataset comprises at least one of protected health information (PHI) or personal identifiable information (PII), pseudonymize or deidentify the dataset to generate a cohort; if the dataset does not comprise any of PHI and PII, generate the cohort as a direct copy of the dataset; and store the cohort.
Claim 11 recites receive a schema definition; receive, a selection of a cohort of a plurality of cohorts associated with the at least one client agent, wherein each cohort was generated by accessing a respective dataset and pseudonymizing or deidentifying the dataset if the dataset comprises at least one of protected health information (PHI) or personal identifiable information (PII).
Claim 18 recites receive processing instructions; instruct to perform the processing instructions on the associated datasets; and receive an output from the processing instructions.
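For illustration only, the conditional cohort generation recited in representative claim 1 can be sketched as follows. This is a minimal sketch under stated assumptions, not the applicant's implementation: the field names in PHI_PII_FIELDS, the salted SHA-256 pseudonymization, and the function names are all hypothetical and do not appear in the claims.

```python
import copy
import hashlib

# Hypothetical set of field names treated as PHI/PII (an assumption for this sketch).
PHI_PII_FIELDS = {"name", "ssn", "date_of_birth"}


def contains_phi_or_pii(dataset):
    """Return True if any record carries a field flagged as PHI or PII."""
    return any(key in PHI_PII_FIELDS for record in dataset for key in record)


def pseudonymize(record, salt):
    """Replace flagged fields with a truncated salted SHA-256 digest (assumed technique)."""
    result = {}
    for key, value in record.items():
        if key in PHI_PII_FIELDS:
            digest = hashlib.sha256((salt + str(value)).encode("utf-8")).hexdigest()
            result[key] = digest[:16]
        else:
            result[key] = value
    return result


def generate_cohort(dataset, salt="project-salt"):
    """Pseudonymize when PHI/PII is present; otherwise return a direct copy."""
    if contains_phi_or_pii(dataset):
        return [pseudonymize(record, salt) for record in dataset]
    return copy.deepcopy(dataset)
```

The branch mirrors the two conditions of the claim: a dataset containing PHI or PII is transformed before the cohort is stored, while a dataset containing neither is copied unchanged.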
The claims recite a mental process because a person could, mentally or with simple tools such as pen and paper, perform the claimed limitations of receiving a cohort of project data, processing the dataset based on instructions by performing actions such as determining whether the data contains PHI or PII and, if so, pseudonymizing or deidentifying the data, and generating an output. The claims merely recite a process for performing instructions to receive and analyze datasets and process the data to generate an output based on instructions. The claims are found to recite a mental process because the claim limitations recite concepts the courts have defined as being performed in the human mind, such as observations, evaluations, judgments, and opinions. Additionally, the courts have identified similar claims of collecting information, analyzing it, and displaying certain results of the collection and analysis as being processes capable of being performed in the human mind or by using basic tools. Therefore, the claims recite a mental process.
Alternatively, the claims recite a certain method of organizing human activity, as the claims recite managing personal behavior or relationships or interactions between people. The claims merely recite a series of instructions or steps to receive and process information, such as receiving a dataset and performing instructions to pseudonymize or deidentify the dataset to generate an output. Merely reciting steps to interact with and process data in a dataset to generate an output based on a series of instructions is a certain method of organizing human activity.
Step 2A Prong 2 (Is the exception integrated into a practical application?): The claims additionally recite:
Claim 1: a distributed computing system comprising: a client agent that resides on a network and is communicably coupled to a central server that resides outside of the network, the client agent comprising instructions which, when executed by one or more processors, cause the client agent to perform a process operable to: a workstation on the network, maintained by the central server, store in a database on the network; wherein the client agent is configured to perform compute tasks on the cohort.
Claim 11: a system for providing flexible distributed computation comprising: a server accessible by at least one client agent, the at least one client agent residing on a respective network associated with at least one site; wherein the server comprises instructions which, when executed by one or more processors, cause the server to perform a process operable to: from a user device; receive a container from the user device, the container comprising code to be executed, a respective network, and send a request to a client agent associated with the selected cohort; wherein the client agent pulls an image of the container and executes the code on the selected cohort.
Claim 18: a system for providing flexible distributed computation comprising: a plurality of client agents, each client agent residing on a respective network associated with a respective site and being configured to access an associated dataset; and one or more servers communicably coupled to the plurality of client agents, wherein each of the one or more servers comprises instructions which, when executed by one or more processors, cause the one or more servers to perform a process operable to: a user device, one or more of the plurality of client agents.
However, the limitations merely amount to adding the words “apply it” (or an equivalent) with the judicial exception, or mere instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea, as discussed in MPEP 2106.05(f). Merely utilizing a computer system to perform basic actions such as receiving and processing information based on a series of instructions and storing a result is not an improvement to a technology. Furthermore, a method for transmitting, receiving, and processing information does not amount to improvements to the functioning of a computer, or to any other technology or technical field, as discussed in MPEP 2106.05(a), or applying or using the judicial exception in some other meaningful way beyond generally linking the use of the judicial exception to a particular technological environment, such that the claim as a whole is more than a drafting effort designed to monopolize the exception, as discussed in MPEP 2106.05(e). Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea.
Step 2B (Does the claim recite additional elements that amount to significantly more than the judicial exception?):
As discussed above, the additional limitations amount to adding the words “apply it” (or an equivalent) with the judicial exception, or mere instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea of receiving and processing a dataset according to a series of instructions, see reasoning for Step 2A prong 2. Therefore, the additional elements do not amount to significantly more as they do not impose any meaningful limitations on the abstract idea.
Claims 2-4, 6-8, 12-13, 16-17, and 19-20 are directed to further narrowing the abstract idea of processing data to generate a cohort of data according to a series of instructions as disclosed in independent claims 1, 11, and 18.
The dependent claims recite the following additional elements:
Claim 5: wherein the client agent comprises at least one of a cloud-based server in a virtual private cloud, an on-site provisioned virtual machine, or an on-site server with access to data in a network and compute processing devices including one or more of CPUs or GPUs.
Claim 9: wherein executing the decrypted code comprises executing the decrypted code on at least one of a central processing unit (CPU) or a graphics processing unit (GPU) or in a Trusted Execution Environment.
Claim 10: transmit aggregate output statistics or execution results to the central server.
Claim 14: wherein the code key is provided to the client agent via an external key management system.
However, these elements merely amount to “apply it,” that is, applying a known technology to perform the abstract idea.
Therefore, claims 1-20 are rejected under 35 U.S.C. 101.
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
Claims 11-20 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Kho (US 2017/0208041).
Claim 11: Kho discloses a system for providing flexible distributed computation comprising: a server accessible by at least one client agent, the at least one client agent residing on a respective network associated with at least one site (Paragraph [0003]; [0011-0012]; [0014-0015]; Fig. 1, aspects of the present disclosure include methods, systems, and computer readable mediums for dynamically de-identifying data from a data source. The methods include a remote device deployed in a first security domain of a communication network, the remote device for generating a key value corresponding to a project requiring de-identification of data. The methods further involve a first server located in a second security domain of a communication network. The first server is configured to: securely receive the key value from the remote device and in response to receiving the key value: access a dataset maintained at the first server, the dataset including at least one individual record that uniquely identifies an individual. The first server is further configured to de-identify the dataset so that the at least one individual record in the dataset no longer uniquely identifies the individual. One or more client devices may securely transmit confidential patient health data (e.g. PHI and/or PII) to the de-identification and anonymous linkage management system (DALMS). For example, if a user were interested in de-identifying data for a particular project a user may interact with the one or more client devices to provide data including a seed value and/or key value that identifies the relevant project, and which may be processed by the de-identification module);
wherein the server comprises instructions which, when executed by one or more processors, cause the server to perform a process operable to: receive a schema definition from a user device (Paragraph [0003]; [0011-0012]; [0014-0015]; Fig. 1, aspects of the present disclosure include methods, systems, and computer readable mediums for dynamically de-identifying data from a data source. One or more client devices may securely transmit confidential patient health data (e.g. PHI and/or PII) to the de-identification and anonymous linkage management system (DALMS). For example, if a user were interested in de-identifying data for a particular project a user may interact with the one or more client devices to provide data including a seed value and/or key value that identifies the relevant project, and which may be processed by the de-identification module);
receive a container from the user device, the container comprising code to be executed (Paragraph [0003]; [0015]; [0018-0020]; Fig. 2, the methods, systems, and computer readable mediums further involve a first server located in a domain of the communication network. The first server is configured to: securely receive the key value from the remote device and in response to receiving the key value: access a dataset maintained at the first server. In some embodiments, a graphical user interface may be generated and/or initialized at the one or more client devices that may be employed to deliver an encrypted, project based, seed/key value to the DALMS. For example, a graphical user interface or command line application may be initialized at the client devices that allows a user to request the seed/key value assigned to a particular project, and for which a user is approved to access);
receive, from the user device, a selection of a cohort of a plurality of cohorts associated with the at least one client agent, wherein each cohort was generated by the at least one client agent accessing a respective dataset within a respective network and pseudonymizing or deidentifying the dataset if the dataset comprises at least one of protected health information (PHI) or personal identifiable information (PII) (Paragraph [0003]; [0011-0012]; [0014-0015]; Fig. 1, aspects of the present disclosure include methods, systems, and computer readable mediums for dynamically de-identifying data from a data source. The methods include a remote device deployed in a first security domain of a communication network, the remote device for generating a key value corresponding to a project requiring de-identification of data. The methods further involve a first server located in a second security domain of a communication network. The first server is configured to: securely receive the key value from the remote device and in response to receiving the key value: access a dataset maintained at the first server, the dataset including at least one individual record that uniquely identifies an individual. The first server is further configured to de-identify the dataset so that the at least one individual record in the dataset no longer uniquely identifies the individual. One or more client devices may securely transmit confidential patient health data (e.g. PHI and/or PII) to the de-identification and anonymous linkage management system (DALMS). For example, if a user were interested in de-identifying data for a particular project a user may interact with the one or more client devices to provide data including a seed value and/or key value that identifies the relevant project, and which may be processed by the de-identification module);
and send a request to a client agent associated with the selected cohort; wherein the client agent pulls an image of the container and executes the code on the selected cohort (Paragraph [0003]; [0015]; [0018-0020]; Fig. 2, the methods, systems, and computer readable mediums further involve a first server located in a domain of the communication network. The first server is configured to: securely receive the key value from the remote device and in response to receiving the key value: access a dataset maintained at the first server, the dataset including at least one individual record. The first server is further configured to de-identify the dataset so that the at least one individual record in the dataset no longer uniquely identifies the individual. In some embodiments, a graphical user interface may be generated and/or initialized at the one or more client devices that may be employed to deliver an encrypted, project based, seed/key value to the DALMS. For example, a graphical user interface or command line application may be initialized at the client devices that allows a user to request the seed/key value assigned to a particular project, and for which a user is approved to access).
Claim 12: Kho discloses the system as per claim 11. Kho further discloses wherein an output of the executed code comprises a new cohort for each input cohort, a set of new cohorts, or a set of data points or statistics that result from the code execution on each input cohort (Paragraph [0003]; [0011-0012]; [0014-0015]; [0019-0020]; Fig. 1, aspects of the present disclosure include methods, systems, and computer readable mediums for dynamically de-identifying data from a data source. The methods include a remote device deployed in a first security domain of a communication network, the remote device for generating a key value corresponding to a project requiring de-identification of data. The methods further involve a first server located in a second security domain of a communication network. The first server is configured to: securely receive the key value from the remote device and in response to receiving the key value: access a dataset maintained at the first server, the dataset including at least one individual record that uniquely identifies an individual. The first server is further configured to de-identify the dataset so that the at least one individual record in the dataset no longer uniquely identifies the individual. One or more client devices may securely transmit confidential patient health data (e.g. PHI and/or PII) to the de-identification and anonymous linkage management system (DALMS). For example, if a user were interested in de-identifying data for a particular project a user may interact with the one or more client devices to provide data including a seed value and/or key value that identifies the relevant project, and which may be processed by the de-identification module. In some embodiments, a graphical user interface may be generated and/or initialized at the one or more client devices that may be employed to deliver an encrypted, project-based, seed/key value to DALMS. 
For example, a graphical user interface or command line application may be initialized at the client devices that allows a user to request the seed/key value assigned to a particular project. In response, the graphical user interface web portal may encrypt the seed/key value and transmit the seed/key value over a secured protocol to de-identification application at source client end device).
Claim 13: Kho discloses the system as per claim 11. Kho further discloses wherein receiving the container from the user device comprises receiving an encrypted container, wherein the client agent decrypts the container with a code key (Paragraph [0017]; [0021-0022]; Fig. 2, in some embodiments the security domain represents separate sets and/or networks of computing components and cannot be accessed without proper authentication and/or encryption/decryption. In response the graphical user interface web portal may encrypt the seed/key value and transmit the seed/key value over a secured protocol to de-identification application as source/client end device. For example and in one embodiment, the seed value and/or key value may be encrypted using secure hash algorithms. The encrypted seed value and/or key value is securely transmitted to the DALMS. The seed/key value is decrypted and used to process patient health data. The DALMS processes the seed value in combination with other PHI/PII elements to generate hashes via cryptographic algorithms).
Claim 14: Kho discloses the system as per claim 13. Kho further discloses wherein the code key is provided to the client agent via an external key management system (Paragraph [0017]; [0021-0022]; Fig. 2, in some embodiments the security domain represents separate sets and/or networks of computing components and cannot be accessed without proper authentication and/or encryption/decryption. In response the graphical user interface web portal may encrypt the seed/key value and transmit the seed/key value over a secured protocol to de-identification application as source/client end device. For example and in one embodiment, the seed value and/or key value may be encrypted using secure hash algorithms. The encrypted seed value and/or key value is securely transmitted to the DALMS. The seed/key value is decrypted and used to process patient health data. The DALMS processes the seed value in combination with other PHI/PII elements to generate hashes via cryptographic algorithms).
Claim 15: Kho discloses the system as per claim 11. Kho further discloses wherein the process is further operable to: receive a schema definition from the user device; and provide the schema definition to the at least one client agent to validate the dataset (Paragraph [0003]; [0011-0012]; [0014-0015]; [0019-0020]; Fig. 1, aspects of the present disclosure include methods, systems, and computer readable mediums for dynamically de-identifying data from a data source. The methods include a remote device deployed in a first security domain of a communication network, the remote device for generating a key value corresponding to a project requiring de-identification of data. The methods further involve a first server located in a second security domain of a communication network. The first server is configured to: securely receive the key value from the remote device and in response to receiving the key value: access a dataset maintained at the first server, the dataset including at least one individual record that uniquely identifies an individual. The first server is further configured to de-identify the dataset so that the at least one individual record in the dataset no longer uniquely identifies the individual. One or more client devices may securely transmit confidential patient health data (e.g. PHI and/or PII) to the de-identification and anonymous linkage management system (DALMS). For example, if a user were interested in de-identifying data for a particular project a user may interact with the one or more client devices to provide data including a seed value and/or key value that identifies the relevant project, and which may be processed by the de-identification module. In some embodiments, a graphical user interface may be generated and/or initialized at the one or more client devices that may be employed to deliver an encrypted, project-based, seed/key value to DALMS. 
For example, a graphical user interface or command line application may be initialized at the client devices that allows a user to request the seed/key value assigned to a particular project. In response, the graphical user interface web portal may encrypt the seed/key value and transmit the seed/key value over a secured protocol to de-identification application at source client end device).
Claim 16: Kho discloses the system as per claim 11. Kho further discloses wherein the process is further operable to: receive a project permission configuration from the user device, the configuration comprising one or more data permissions for one or more collaborators; and enforce the permission configuration (Paragraph [0125]; [0129]; [0174]; Fig. 3, a policy manager may be configured to manage one or more policy agents. A policy may dictate that a data scientist may receive an outputted dataset enclosed in a secure enclave. This means the data in the dataset is non-transparent to the data scientist. The latter is free to run additional output requests on the outputted dataset in the enclave by injecting new requests into the enclave. In those cases, when the outputted dataset does not have any PII data or does not violate the privacy parameters constraint, the dataset may become unconstrained and may be made available to the data scientist. One way of dealing with healthcare data is to anonymize or mask the private data attributes, e.g., mask social security numbers. In some embodiments methods may be employed for masking and de-identifying personal information from healthcare records. Using these methods, a dataset containing healthcare records may have various portions of its data attributes masked or de-identified).
Claim 17: Kho discloses the system as per claim 15. Kho further discloses wherein the process is further operable to receive an updated schema definition from the user device (Paragraph [0003]; [0011-0012]; [0014-0015]; [0019-0020]; Fig. 1, aspects of the present disclosure include methods, systems, and computer readable mediums for dynamically de-identifying data from a data source. The methods include a remote device deployed in a first security domain of a communication network, the remote device for generating a key value corresponding to a project requiring de-identification of data. The methods further involve a first server located in a second security domain of a communication network. The first server is configured to: securely receive the key value from the remote device and in response to receiving the key value: access a dataset maintained at the first server, the dataset including at least one individual record that uniquely identifies an individual. The first server is further configured to de-identify the dataset so that the at least one individual record in the dataset no longer uniquely identifies the individual. One or more client devices may securely transmit confidential patient health data (e.g. PHI and/or PII) to the de-identification and anonymous linkage management system (DALMS). For example, if a user were interested in de-identifying data for a particular project a user may interact with the one or more client devices to provide data including a seed value and/or key value that identifies the relevant project, and which may be processed by the de-identification module. In some embodiments, a graphical user interface may be generated and/or initialized at the one or more client devices that may be employed to deliver an encrypted, project-based, seed/key value to DALMS. 
For example, a graphical user interface or command line application may be initialized at the client devices that allows a user to request the seed/key value assigned to a particular project. In response, the graphical user interface web portal may encrypt the seed/key value and transmit the seed/key value over a secured protocol to de-identification application at source client end device).
Claim 18: Kho discloses a system for providing flexible distributed computation comprising: a plurality of client agents, each client agent residing on a respective network associated with a respective site and being configured to access an associated dataset (Paragraph [0003]; [0011-0012]; [0014-0015]; Fig. 1, aspects of the present disclosure include methods, systems, and computer readable mediums for dynamically de-identifying data from a data source. The methods include a remote device deployed in a first security domain of a communication network, the remote device for generating a key value corresponding to a project requiring de-identification of data. The methods further involve a first server located in a second security domain of a communication network. The first server is configured to: securely receive the key value from the remote device and in response to receiving the key value: access a dataset maintained at the first server, the dataset including at least one individual record that uniquely identifies an individual. The first server is further configured to de-identify the dataset so that the at least one individual record in the dataset no longer uniquely identifies the individual. One or more client devices may securely transmit confidential patient health data (e.g. PHI and/or PII) to the de-identification and anonymous linkage management system (DALMS). For example, if a user were interested in de-identifying data for a particular project a user may interact with the one or more client devices to provide data including a seed value and/or key value that identifies the relevant project, and which may be processed by the de-identification module);
and one or more servers communicably coupled to the plurality of client agents, wherein each of the one or more servers comprises instructions which, when executed by one or more processors, cause the one or more servers to perform a process operable to: receive processing instructions from a user device (Paragraph [0033]; [0039]; Fig. 5, an example of a suitable computing and networking environment that may be used to implement various aspects of the present disclosure. The computing and networking environment includes a general-purpose computing device; the networking environment may include one or more other computing systems, such as personal computers, server computers, hand-held or laptop devices, and the like. The computer may operate in a networked or cloud-computing environment using logical connections of a network interface or adapter to one or more remote devices. The remote computer may be a personal computer, a server, a router, a network PC, and typically includes many or all of the elements described relative to the computer);
instruct one or more of the plurality of client agents to perform the processing instructions on the associated datasets (Paragraph [0015]; [0018-0020]; Fig. 2, in some embodiments, a graphical user interface may be generated and/or initialized at the one or more client devices that may be employed to deliver an encrypted, project based, seed/key value to the DALMS. For example, a graphical user interface or command line application may be initialized at the client devices that allows a user to request the seed/key value assigned to a particular project, and for which a user is approved to access);
and receive an output from each of the client agents that performed the processing instructions (Paragraph [0003]; [0011-0012]; [0014-0015]; [0032]; Fig. 1, aspects of the present disclosure include methods, systems, and computer readable mediums for dynamically de-identifying data from a data source. The methods include a remote device deployed in a first security domain of a communication network, the remote device for generating a key value corresponding to a project requiring de-identification of data. The methods further involve a first server located in a second security domain of a communication network. The first server is configured to: securely receive the key value from the remote device and in response to receiving the key value: access a dataset maintained at the first server, the dataset including at least one individual record that uniquely identifies an individual. The first server is further configured to de-identify the dataset so that the at least one individual record in the dataset no longer uniquely identifies the individual. One or more client devices may securely transmit confidential patient health data (e.g. PHI and/or PII) to the de-identification and anonymous linkage management system (DALMS). For example, if a user were interested in de-identifying data for a particular project a user may interact with the one or more client devices to provide data including a seed value and/or key value that identifies the relevant project, and which may be processed by the de-identification module. All combined datasets are provided with an associated “universal” identifier that uniquely identifies the hashed data sets as a single set of identifiable health patient data that corresponds to one individual, without actually identifying a unique individual healthcare patient).
Claim 19: Kho discloses the system as per claim 18. Kho further discloses wherein the process is further operable to encrypt the output from each of the client agents that performed the processing instructions (Paragraph [0017]; [0021-0022]; Fig. 2, in some embodiments the security domain represents separate sets and/or networks of computing components and cannot be accessed without proper authentication and/or encryption/decryption. In response the graphical user interface web portal may encrypt the seed/key value and transmit the seed/key value over a secured protocol to de-identification application as source/client end device. For example and in one embodiment, the seed value and/or key value may be encrypted using secure hash algorithms. The encrypted seed value and/or key value is securely transmitted to the DALMS. The seed/key value is decrypted and used to process patient health data. The DALMS processes the seed value in combination with other PHI/PII elements to generate hashes via cryptographic algorithms).
Claim 20: Kho discloses the system as per claim 19. Kho further discloses wherein encrypting the output from each of the client agents comprises performing a homomorphic encryption process (Paragraph [0017]; [0021-0022]; Fig. 2, in some embodiments the security domain represents separate sets and/or networks of computing components and cannot be accessed without proper authentication and/or encryption/decryption. In response the graphical user interface web portal may encrypt the seed/key value and transmit the seed/key value over a secured protocol to de-identification application as source/client end device. For example and in one embodiment, the seed value and/or key value may be encrypted using secure hash algorithms. The encrypted seed value and/or key value is securely transmitted to the DALMS. The seed/key value is decrypted and used to process patient health data. The DALMS processes the seed value in combination with other PHI/PII elements to generate hashes via cryptographic algorithms).
Therefore, claims 11-20 are rejected under 35 U.S.C. 102.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or non-obviousness.
Claims 1-10 are rejected under 35 U.S.C. 103 as being unpatentable over Kho (US 2017/0208041) in view of Ardhanari (US 2021/0248268).
Claim 1: Kho discloses a distributed computing system comprising: a client agent that resides on a network and is communicably coupled to a central server that resides outside of the network, the client agent comprising instructions which, when executed by one or more processors, cause the client agent to perform a process operable to: receive, from a workstation on the network, a dataset associated with a project maintained by the central server (Paragraph [0003]; [0011-0012]; [0014-0015]; Fig. 1, aspects of the present disclosure include methods, systems, and computer readable mediums for dynamically de-identifying data from a data source. The methods include a remote device deployed in a first security domain of a communication network, the remote device for generating a key value corresponding to a project requiring de-identification of data. The methods further involve a first server located in a second security domain of a communication network. The first server is configured to: securely receive the key value from the remote device and in response to receiving the key value: access a dataset maintained at the first server, the dataset including at least one individual record that uniquely identifies an individual. The first server is further configured to de-identify the dataset so that the at least one individual record in the dataset no longer uniquely identifies the individual. One or more client devices may securely transmit confidential patient health data (e.g. PHI and/or PII) to the de-identification and anonymous linkage management system (DALMS). For example, if a user were interested in de-identifying data for a particular project a user may interact with the one or more client devices to provide data including a seed value and/or key value that identifies the relevant project, and which may be processed by the de-identification module);
if the dataset comprises at least one of protected health information (PHI) or personal identifiable information (PII), pseudonymize or deidentify the dataset to generate a cohort (Paragraph [0003]; [0011-0012]; [0014-0015]; [0032]; Fig. 1, aspects of the present disclosure include methods, systems, and computer readable mediums for dynamically de-identifying data from a data source. The methods include a remote device deployed in a first security domain of a communication network, the remote device for generating a key value corresponding to a project requiring de-identification of data. The methods further involve a first server located in a second security domain of a communication network. The first server is configured to: securely receive the key value from the remote device and in response to receiving the key value: access a dataset maintained at the first server, the dataset including at least one individual record that uniquely identifies an individual. The first server is further configured to de-identify the dataset so that the at least one individual record in the dataset no longer uniquely identifies the individual. One or more client devices may securely transmit confidential patient health data (e.g. PHI and/or PII) to the de-identification and anonymous linkage management system (DALMS). For example, if a user were interested in de-identifying data for a particular project a user may interact with the one or more client devices to provide data including a seed value and/or key value that identifies the relevant project, and which may be processed by the de-identification module. All combined datasets are provided with an associated “universal” identifier that uniquely identifies the hashed data sets as a single set of identifiable health patient data that corresponds to one individual, without actually identifying a unique individual healthcare patient);
and store the cohort in a database on the network (Paragraph [0016], in some embodiments, the DALMS may communicate with an anonymous linker that may be located within its own security domain. The anonymous linker may employ a generator to link patient health data that has been newly de-identified by the DALMS with patient health data that was previously de-identified and is currently being maintained by the anonymous linker, such as for example in a database);
wherein the client agent is configured to perform compute tasks on the cohort (Paragraph [0015]; [0018-0020]; Fig. 2, in some embodiments, a graphical user interface may be generated and/or initialized at the one or more client devices that may be employed to deliver an encrypted, project based, seed/key value to the DALMS. For example, a graphical user interface or command line application may be initialized at the client devices that allows a user to request the seed/key value assigned to particular project, and for which a user is approved to access).
Kho discloses a system of de-identifying personal data for a project such as PHI and PII information found in a dataset when generating a cohort. However, Kho does not specifically disclose the following claim limitations: if the dataset does not comprise any of PHI and PII, generate the cohort as a direct copy of the dataset.
In the same field of endeavor of protecting private data during a project, Ardhanari teaches: if the dataset does not comprise any of PHI and PII, generate the cohort as a direct copy of the dataset (Paragraph [0125]; [0129]; [0174]; Fig. 3, a policy manager may be configured to manage one or more policy agents. A policy may dictate that a data scientist may receive an outputted dataset enclosed in a secure enclave. This means the data in the dataset is non-transparent to the data scientist. The latter is free to run additional output requests on the outputted dataset in the enclave by injecting new requests into the enclave. In those cases, when the outputted dataset does not have any PII data or does not violate the privacy parameters constraint, the dataset may become unconstrained and may be made available to the data scientist. One way of dealing with healthcare data is to anonymize or mask the private data attributes, e.g., mask social security numbers. In some embodiments, methods may be employed for masking and de-identifying personal information from healthcare records. Using these methods, a dataset containing healthcare records may have various portions of its data attributes masked or de-identified).
Before the effective filing date of the invention, it would have been obvious to one of ordinary skill in the art to modify the system of de-identifying personal data in a dataset to be used in a project as disclosed by Kho (Kho [0011]) with the feature of generating the cohort as a direct copy of the dataset if the dataset does not comprise any of PHI and PII, as taught by Ardhanari (Ardhanari Fig. 3). The modification would have been obvious to try, as Kho discloses a system of generating a cohort of data for a project by determining if the data contains PHI or PII information and performing deidentification processes on the information determined to be private, while non-private information would not need to be processed through the deidentification processes. Additionally, the modification would help to improve analyzing biomedical data while maintaining the privacy of individual patients (Ardhanari [0005]).
Claim 2: Modified Kho discloses the distributed computing system as per claim 1. Kho further discloses wherein the process is further operable to validate a format of the dataset according to a schema associated with the project (Paragraph [0003]; [0011-0012]; [0014-0016]; [0020]; Fig. 1, aspects of the present disclosure include methods, systems, and computer readable mediums for dynamically de-identifying data from a data source. The methods include a remote device deployed in a first security domain of a communication network, the remote device for generating a key value corresponding to a project requiring de-identification of data. The methods further involve a first server located in a second security domain of a communication network. The first server is configured to: securely receive the key value from the remote device and in response to receiving the key value: access a dataset maintained at the first server, the dataset including at least one individual record that uniquely identifies an individual. The first server is further configured to de-identify the dataset so that the at least one individual record in the dataset no longer uniquely identifies the individual. Maintained in the database of the anonymous linker is patient health data that has already been de-identified. When new data is de-identified by the DALMS the newly de-identified data is combined and/or otherwise matched with the already existing de-identified data to ensure there is no duplicate data. A unique “universal” identifier is then provided for the matched de-identified data, thereby uniquely identifying the data without actually identifying a specific health care patient. A user may select or otherwise identify a project and generate a seed. Each seed or key is unique to each project and each user needs to be approved and assigned to a project before using the de-identification application. 
A user may provide project details to the command line application for requesting the relevant seed/key).
Claim 3: Modified Kho discloses the distributed computing system as per claim 2. Kho further discloses wherein the schema is a pre-defined schema (Paragraph [0003]; [0011-0012]; [0014-0015]; [0019-0020]; Fig. 1, aspects of the present disclosure include methods, systems, and computer readable mediums for dynamically de-identifying data from a data source. The methods include a remote device deployed in a first security domain of a communication network, the remote device for generating a key value corresponding to a project requiring de-identification of data. The methods further involve a first server located in a second security domain of a communication network. The first server is configured to: securely receive the key value from the remote device and in response to receiving the key value: access a dataset maintained at the first server, the dataset including at least one individual record that uniquely identifies an individual. The first server is further configured to de-identify the dataset so that the at least one individual record in the dataset no longer uniquely identifies the individual. One or more client devices may securely transmit confidential patient health data (e.g. PHI and/or PII) to the de-identification and anonymous linkage management system (DALMS). For example, if a user were interested in de-identifying data for a particular project a user may interact with the one or more client devices to provide data including a seed value and/or key value that identifies the relevant project, and which may be processed by the de-identification module. In some embodiments, a graphical user interface may be generated and/or initialized at the one or more client devices that may be employed to deliver an encrypted, project-based, seed/key value to DALMS. For example, a graphical user interface or command line application may be initialized at the client devices that allows a user to request the seed/key value assigned to a particular project. 
In response, the graphical user interface web portal may encrypt the seed/key value and transmit the seed/key value over a secured protocol to de-identification application at source client end device).
Claim 4: Modified Kho discloses the distributed computing system as per claim 2. Kho further discloses wherein the schema comprises a schema definition received from a user device, generated by a project lead, or derived from the dataset (Paragraph [0003]; [0011-0012]; [0014-0015]; [0019-0020]; Fig. 1, aspects of the present disclosure include methods, systems, and computer readable mediums for dynamically de-identifying data from a data source. The methods include a remote device deployed in a first security domain of a communication network, the remote device for generating a key value corresponding to a project requiring de-identification of data. The methods further involve a first server located in a second security domain of a communication network. The first server is configured to: securely receive the key value from the remote device and in response to receiving the key value: access a dataset maintained at the first server, the dataset including at least one individual record that uniquely identifies an individual. The first server is further configured to de-identify the dataset so that the at least one individual record in the dataset no longer uniquely identifies the individual. One or more client devices may securely transmit confidential patient health data (e.g. PHI and/or PII) to the de-identification and anonymous linkage management system (DALMS). For example, if a user were interested in de-identifying data for a particular project a user may interact with the one or more client devices to provide data including a seed value and/or key value that identifies the relevant project, and which may be processed by the de-identification module. In some embodiments, a graphical user interface may be generated and/or initialized at the one or more client devices that may be employed to deliver an encrypted, project-based, seed/key value to DALMS. 
For example, a graphical user interface or command line application may be initialized at the client devices that allows a user to request the seed/key value assigned to a particular project. In response, the graphical user interface web portal may encrypt the seed/key value and transmit the seed/key value over a secured protocol to de-identification application at source client end device).
Claim 5: Modified Kho discloses the distributed computing system as per claim 1. Kho further discloses wherein the client agent comprises at least one of a cloud-based server in a virtual private cloud, an on-site provisioned virtual machine, or an on-site server with access to data in a network and compute processing devices including one or more of CPUs or GPUs (Paragraph [0033]; [0039]; Fig. 5, an example of a suitable computing and networking environment that may be used to implement various aspects of the present disclosure. The computing and networking environment includes a general-purpose computing device, and the networking environment may include one or more other computing systems, such as personal computers, server computers, hand-held or laptop devices, and the like. The computer may operate in a networked or cloud-computing environment using logical connections of a network interface or adapter to one or more remote devices. The remote computer may be a personal computer, a server, a router, a network PC, and typically includes many or all of the elements described relative to the computer).
Claim 6: Modified Kho discloses the distributed computing system as per claim 1. Kho further discloses wherein receiving the dataset comprises receiving at least one of a tabular dataset, imaging data, file data, video data, EHR data, graph data, or streamed data (Paragraph [0014]; [0037], the DALMS represents the various computing systems, services, applications, and/or processes that may be deployed at a healthcare enterprise, and thus may include large data sets of confidential and/or sensitive patient health data (e.g. PHI and/or PII). For example, the DALMS may include PHI and/or PII information for a plurality of medical patients, or other confidential patient health care data, which includes any information in the medical record or designated record set that can be used to identify an individual).
Claim 7: Modified Kho discloses the distributed computing system as per claim 1. Kho further discloses wherein the process is further operable to: receive encrypted code and a code key from the central server; decrypt the encrypted code with the received code key; and execute the decrypted code (Paragraph [0017]; [0021-0022]; Fig. 2, in some embodiments the security domain represents separate sets and/or networks of computing components and cannot be accessed without proper authentication and/or encryption/decryption. In response the graphical user interface web portal may encrypt the seed/key value and transmit the seed/key value over a secured protocol to de-identification application as source/client end device. For example and in one embodiment, the seed value and/or key value may be encrypted using secure hash algorithms. The encrypted seed value and/or key value is securely transmitted to the DALMS. The seed/key value is decrypted and used to process patient health data. The DALMS processes the seed value in combination with other PHI/PII elements to generate hashes via cryptographic algorithms).
Claim 8: Modified Kho discloses the distributed computing system as per claim 7. Kho further discloses wherein receiving the encrypted code comprises receiving at least one of encrypted model code or an encrypted container (Paragraph [0017]; [0021-0022]; Fig. 2, in some embodiments the security domain represents separate sets and/or networks of computing components and cannot be accessed without proper authentication and/or encryption/decryption. In response the graphical user interface web portal may encrypt the seed/key value and transmit the seed/key value over a secured protocol to de-identification application as source/client end device. For example and in one embodiment, the seed value and/or key value may be encrypted using secure hash algorithms. The encrypted seed value and/or key value is securely transmitted to the DALMS. The seed/key value is decrypted and used to process patient health data. The DALMS processes the seed value in combination with other PHI/PII elements to generate hashes via cryptographic algorithms).
Claim 9: Modified Kho discloses the distributed computing system as per claim 7. Kho further discloses wherein executing the decrypted code comprises executing the decrypted code on at least one of a central processing unit (CPU) or a graphics processing unit (GPU) or in a Trusted Execution Environment (Paragraph [0017]; [0019]; [0021-0022]; Fig. 2, in some embodiments the security domain represents separate sets and/or networks of computing components and cannot be accessed without proper authentication and/or encryption/decryption. In response the graphical user interface web portal may encrypt the seed/key value and transmit the seed/key value over a secured protocol to de-identification application as source/client end device. For example and in one embodiment, the seed value and/or key value may be encrypted using secure hash algorithms. The encrypted seed value and/or key value is securely transmitted to the DALMS. The seed/key value is decrypted and used to process patient health data. The DALMS processes the seed value in combination with other PHI/PII elements to generate hashes via cryptographic algorithms).
Claim 10: Modified Kho discloses the distributed computing system as per claim 7. Kho further discloses wherein the process is further operable to transmit aggregate output statistics or execution results to the central server (Paragraph [0003]; [0011-0012]; [0014-0015]; [0019-0020]; Fig. 1, aspects of the present disclosure include methods, systems, and computer readable mediums for dynamically de-identifying data from a data source. The methods include a remote device deployed in a first security domain of a communication network, the remote device for generating a key value corresponding to a project requiring de-identification of data. The methods further involve a first server located in a second security domain of a communication network. The first server is configured to: securely receive the key value from the remote device and in response to receiving the key value: access a dataset maintained at the first server, the dataset including at least one individual record that uniquely identifies an individual. The first server is further configured to de-identify the dataset so that the at least one individual record in the dataset no longer uniquely identifies the individual. One or more client devices may securely transmit confidential patient health data (e.g. PHI and/or PII) to the de-identification and anonymous linkage management system (DALMS). For example, if a user were interested in de-identifying data for a particular project a user may interact with the one or more client devices to provide data including a seed value and/or key value that identifies the relevant project, and which may be processed by the de-identification module. In some embodiments, a graphical user interface may be generated and/or initialized at the one or more client devices that may be employed to deliver an encrypted, project-based, seed/key value to DALMS. 
For example, a graphical user interface or command line application may be initialized at the client devices that allows a user to request the seed/key value assigned to a particular project. In response, the graphical user interface web portal may encrypt the seed/key value and transmit the seed/key value over a secured protocol to de-identification application at source client end device).
Therefore, claims 1-10 are rejected under 35 U.S.C. 103.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant’s disclosure:
Rose (US 2018/0173893) Smart de-identification using date jittering.
Eigner (US 2017/0277775) Systems and methods for secure storage of user information in user profile.
Mitchell (US 2009/0046856) Methods and apparatus for encrypting, obfuscating, and reconstructing datasets or objects.
Suppan (US 2023/0045533) Method and system for providing anonymized patient datasets.
Aravamudan (US 2023/0044294) Systems and methods for computing with private healthcare data.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to COREY RUSS whose telephone number is (571)270-5902. The examiner can normally be reached on M-F 7:30-4:30.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Lynda Jasmin, can be reached at (571) 272-6782. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/COREY RUSS/Examiner, Art Unit 3629