DETAILED ACTION
Continued Examination Under 37 CFR 1.114
A request for continued examination (RCE) under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 3/2/2026 has been entered.
Summary and Status of Claims
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
This Office Action is in response to Applicant’s RCE filed 3/2/2026.
Claims 25-44 are pending.
Claims 25, 26, 30-36, and 39-44 are rejected under 35 U.S.C. 103 as being unpatentable over Ransil et al. (US Patent 7,801,912) of record, in view of Vermeulen et al. (US Patent 7,716,180) of record, further in view of Gammaraju et al. (US Patent Pub 2015/0120791).
Claims 27-29, 37, and 38 are rejected under 35 U.S.C. 103 as being unpatentable over Ransil et al. (US Patent 7,801,912) of record, in view of Vermeulen et al. (US Patent 7,716,180) of record and Gammaraju et al. (US Patent Pub 2015/0120791), further in view of Gross et al. (US Patent Pub 2009/0144388).
Claims 25, 35, and 42 are rejected under 35 U.S.C. 103 as being unpatentable over Amazon (“Amazon Elastic MapReduce – Developer Guide”, 3/31/2009), in view of Gammaraju et al. (US Patent Pub 2015/0120791).
Claims 26, 30-34, 36, 39-41, 43, and 44 are rejected under 35 U.S.C. 103 as being unpatentable over Amazon (“Amazon Elastic MapReduce – Developer Guide”, 3/31/2009), in view of Gammaraju et al. (US Patent Pub 2015/0120791), further in view of Vermeulen et al. (US Patent 7,716,180).
Claims 27-29, 37, and 38 are rejected under 35 U.S.C. 103 as being unpatentable over Amazon (“Amazon Elastic MapReduce – Developer Guide”, 3/31/2009), in view of Gammaraju et al. (US Patent Pub 2015/0120791), further in view of Gross et al. (US Patent Pub 2009/0144388).
The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office action.
Note on Prior Art Rejections
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 25, 26, 30-36, and 39-44 are rejected under 35 U.S.C. 103 as being unpatentable over Ransil et al. (US Patent 7,801,912) (Ransil) of record, in view of Vermeulen et al. (US Patent 7,716,180) (Vermeulen) of record, further in view of Gammaraju et al. (US Patent Pub 2015/0120791) (Gammaraju).
In regards to claim 25, Ransil discloses a method, comprising:
receiving, via a distributed computing service of a network-accessible service provider, configuration input from a client for provisioning a distributed computing system (Ransil at Fig. 4; col. 5, lines 1-10; col. 29, lines 15-55; col. 33, lines 40-43; col. 65, lines 32-64; col. 66, lines 30-35; col. 67, line 7);
receiving, via the distributed computing service, a request from the client to provision the distributed computing system (Ransil at Fig. 4; col. 5, lines 1-10; col. 29, lines 15-55; col. 33, lines 40-43; col. 65, lines 32-64; col. 66, lines 30-35; col. 67, line 7);
providing, responsive to the request to provision, the distributed computing system according to the configuration input (Ransil at col. 65, lines 32-58), including:
provisioning, by the distributed computing service and responsive to the request to provision, one or more compute nodes for the distributed computing system, wherein provisioning comprises creating, allocating, and setting up the one or more compute nodes (Ransil at col. 3, lines 21-27; col. 65, lines 32-64; col. 66, lines 30-35; col. 67, line 7);
provisioning, by the distributed computing service and responsive to the request to provision, the distributed computing file system (DCFS) for the distributed computing system via an object storage service, of the network-accessible service provider, (Ransil at col. 3, lines 21-27; col. 5, lines 1-7; col. 8, lines 5-7) that implements a first client-facing interface, wherein provisioning comprises creating, allocating, and setting up the DCFS (Ransil at col. 5, lines 1-10; col. 9, lines 46-59; col. 65, lines 32-64; col. 66, lines 30-35; col. 67, line 7);
provisioning, by the distributed computing service and responsive to the request to provision, a DCFS directory for the distributed computing system via a database service, of the network-accessible service provider, (Ransil at col. 5, lines 1-7; col. 7, lines 4-7) that implements a second client-facing interface, wherein provisioning comprises creating, allocating, and setting up the DCFS directory (Ransil at Fig. 2; col. 9, lines 46-59; col. 65, lines 32-64; col. 66, lines 30-35; col. 67, line 7); and
for a file in the object storage service to be processed (Ransil at col. 7, lines 55-67; col. 8, lines 1-4; col. 20, lines 5-19, 39-61),
accessing the file system metadata from the database service (Ransil at col. 20, lines 39-44),
accessing the file from the object storage service according to the file system metadata accessed via the database service (Ransil at col. 20, lines 39-61; col. 22, lines 10-16).
Ransil does not expressly disclose wherein the DCFS stores data objects as files in a file directory structure and the DCFS directory stores the file directory structure of the DCFS.
Vermeulen discloses storing data objects, which can be files, in buckets that are analogous to a file system directory or folder (i.e., stores data objects as files in a file directory structure). Vermeulen at col. 6, lines 6-11; col. 23, lines 31-50. Vermeulen further discloses a DCFS directory that stores the file directory structure of the DCFS using a data object storage space managed by a file system. Vermeulen at col. 7, lines 49-60; col. 23, lines 31-50.
Ransil and Vermeulen are analogous art because they are both directed to the same field of endeavor of distributed storage systems.
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to modify Ransil by adding the feature wherein the DCFS stores data objects as files in a file directory structure and the DCFS directory stores the file directory structure of the DCFS, as disclosed by Vermeulen.
The motivation for doing so would have been that using a hierarchical file system with a directory structure is conventional and familiar (Vermeulen at col. 7, lines 49-51), allowing a client to write to or access particular directory paths with which they are familiar.
Ransil in view of Vermeulen does not expressly disclose that the compute node of the distributed computing system, provisioned by the distributed computing service of the network-accessible service provider, performs the accessing of file system metadata from the database service, the accessing the file from the object storage service according to the file system metadata accessed via the database service, modifying the accessed file, and writing the modified file to the object storage service of the network-accessible service provider according to the file system metadata accessed via the database service of the network-accessible service provider.
Gammaraju discloses a system and method for a multi-tenant implementation of a distributed file system, such as the Hadoop Distributed File System (HDFS). Gammaraju at para. 0011. As set forth in the rejection above, Ransil discloses provisioning of a distributed computing system based on client configuration input. Much like Ransil, Gammaraju discloses receiving a request to instantiate and deploy a distributed file system comprising a plurality of hosts, which comprise a name node (i.e., DCFS directory), compute nodes, and data nodes (i.e., object storage). Gammaraju at Figs. 1A, 6; paras. 0021, 0048-54. Gammaraju further discloses that compute nodes (also referred to as compute VMs) carry out tasks from clients, including accessing data VMs (i.e., object storage) to perform reading and writing of data blocks during execution of a job (i.e., accessing the file system metadata and accessing the file from the object storage). A job can include processing files in the file system (i.e., modifying a file) and storing the output of the job in a directory (i.e., modifying a file and writing the modified file to the object storage service). Gammaraju at paras. 0021, 0039-40. The data nodes and name node are all part of the network-accessible service provider, which provides the distributed computing system, because the distributed computing service that implements the distributed computing system is of the network-accessible service provider.
Ransil, Vermeulen, and Gammaraju are analogous art because they are directed to the same field of endeavor of distributed computing systems.
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to modify Ransil in view of Vermeulen by having the compute node of the distributed computing system, provisioned by the distributed computing service of the network-accessible service provider, perform the accessing of the file system metadata from the database service, the accessing of the file from the object storage service according to the file system metadata accessed via the database service, the modifying of the accessed file, and the writing of the modified file to the object storage service of the network-accessible service provider according to the file system metadata accessed via the database service of the network-accessible service provider, as disclosed by Gammaraju.
The motivation for doing so would have been that, in an HDFS, data nodes and compute nodes can be separated to allow the compute nodes to be elastically scaled based on the needs of a distributed application. Gammaraju at para. 0011. Moreover, the use of compute nodes to perform a job requested by a client allows tasks to be carried out in parallel. Gammaraju at paras. 0019-20.
In regards to claim 26, Ransil in view of Vermeulen and Gammaraju discloses the method of claim 25, wherein:
a. the object storage service is not guaranteed to return a latest version of the data objects updated via the first client-facing interface. Ransil at col. 25, lines 17-29; col. 40, lines 37-39.
Ransil in view of Vermeulen and Gammaraju does not expressly disclose that the database service is guaranteed to return a latest version of the metadata of the data objects updated via the second client-facing interface. Ransil does disclose that the service attempts to provide information that is as close to up to date as possible. Ransil at col. 245, lines 38-44.
Vermeulen discloses a keymap (i.e., DCFS directory) in a distributed storage system that is used to determine the location of data objects in response to client requests. The keymap is updated atomically in a strictly synchronous fashion, which guarantees that changes made to the data objects are immediately reflected across the system, which guarantees the latest version of the metadata is returned. Vermeulen at col. 6, lines 40-44; col. 13, lines 34-39; col. 35, lines 27-33; col. 40, lines 6-9.
Ransil, Gammaraju, and Vermeulen are analogous art because they are all directed to the same field of endeavor of distributed storage systems.
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to modify Ransil in view of Vermeulen and Gammaraju by adding the feature wherein the database service is guaranteed to return a latest version of the metadata of the data objects updated via the second client-facing interface, as disclosed by Vermeulen.
The motivation for doing so would have been that Ransil in view of Vermeulen and Gammaraju already discloses attempting to provide information that is as close to up to date as possible, as discussed above. Modifying Ransil in view of Vermeulen and Gammaraju to ensure the searchable index is updated in a strictly synchronous fashion would guarantee that the information is always up to date.
In regards to claim 30, Ransil in view of Vermeulen and Gammaraju discloses the method of claim 25, but does not expressly disclose wherein a synchronization job is executed by an agent implemented in the distributed computing system and comprises:
a. checking for inconsistencies between the DCFS and the DCFS directory; and
b. responsive to discovery of an inconsistency between the DCFS and DCFS directory, resolving the inconsistency.
Ransil does disclose a periodic garbage collection mechanism (i.e., an agent) that deletes metadata that has been marked to be deleted, thereby making the searchable index consistent with the data store. Ransil at col. 15, lines 45-54.
Vermeulen discloses determining when the keymap has not been properly updated and executing reconciliation procedures to correct the issue. This determination is performed periodically. Vermeulen at col. 22, lines 5-31, 65-67; col. 23, lines 1-10; col. 42, lines 19-40. Vermeulen discloses, in the case where one keymap instance is out of sync with the rest of the keymap instances, a divergence and inconsistency is created causing clients to access out of date or different versions of object data. Vermeulen at col. 40, lines 23-27. Thus, the synchronization protocol performed by the system of Vermeulen is determining inconsistencies between keymap instances, which is determining inconsistencies between the object storage and the keymap as a whole, and resolving the inconsistencies by synchronizing the keymap instances to a consistent state. Vermeulen at col. 42, lines 19-40. In this way, the keymap of the system is synchronized with the object storage of the system (i.e., synchronizing the DCFS with the DCFS directory).
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to modify Ransil in view of Vermeulen and Gammaraju by adding the features of checking for inconsistencies between the DCFS and the DCFS directory and, responsive to discovery of an inconsistency between the DCFS and the DCFS directory, resolving the inconsistency, as disclosed by Vermeulen.
The motivation for doing so would have been to ensure that the data and its index remain consistent, a goal shared by Ransil and Vermeulen. Ransil at col. 8, lines 5-7. Vermeulen at col. 40, lines 32-38.
In regards to claim 31, Ransil in view of Vermeulen and Gammaraju discloses the method of claim 30, wherein resolving the inconsistency comprises correcting one or more entries in the metadata stored in the DCFS directory. Vermeulen at col. 42, lines 19-40.
In regards to claim 32, Ransil in view of Vermeulen and Gammaraju discloses the method of claim 30, wherein the checking is performed periodically as a background process. Vermeulen at col. 23, lines 1-5.
In regards to claim 33, Ransil in view of Vermeulen and Gammaraju discloses the method of claim 30, wherein the checking is performed in response to a command of the client. Ransil at col. 65, lines 51-58.
In regards to claim 34, Ransil in view of Vermeulen and Gammaraju discloses the method of claim 30, further comprising provisioning a resource instance for the distributed computing system to implement the agent. Vermeulen at col. 59, lines 31-35.
In regards to claim 35, Ransil discloses a system, comprising:
one or more computing systems comprising one or more processors and memory that implement a network-accessible service (Ransil at Fig. 4; col. 29, lines 15-55) configured to:
receive, via a distributed computing service of a network-accessible service provider, configuration input from a client for provisioning and configuring a distributed computing system (Ransil at Fig. 4; col. 5, lines 1-10; col. 29, lines 15-55; col. 33, lines 40-43; col. 65, lines 32-64; col. 66, lines 30-35; col. 67, line 7);
receive, via the distributed computing service, a request from the client to provision the distributed computing system (Ransil at Fig. 4; col. 5, lines 1-10; col. 29, lines 15-55; col. 33, lines 40-43; col. 65, lines 32-64; col. 66, lines 30-35; col. 67, line 7);
provide, responsive to the request to provision, the distributed computing system according to the configuration input (Ransil at col. 65, lines 32-58), including:
provision, by the distributed computing service and responsive to the request to provision, one or more compute nodes for the distributed computing system, wherein provisioning comprises creating, allocating, and setting up the one or more compute nodes (Ransil at col. 3, lines 21-27; col. 65, lines 32-64; col. 66, lines 30-35; col. 67, line 7);
provision, by the distributed computing service and responsive to the request to provision, the distributed computing file system (DCFS) for the distributed computing system via an object storage service, of the network-accessible service provider, (Ransil at col. 3, lines 21-27; col. 5, lines 1-7; col. 8, lines 5-7) that implements a first client-facing interface, wherein provisioning comprises creating, allocating, and setting up the DCFS (Ransil at col. 5, lines 1-10; col. 9, lines 46-59; col. 65, lines 32-64; col. 66, lines 30-35; col. 67, line 7);
provision, by the distributed computing service and responsive to the request to provision, a DCFS directory for the distributed computing system via a database service, of the network-accessible service provider, (Ransil at col. 5, lines 1-7; col. 7, lines 4-7) that implements a second client-facing interface, wherein provisioning comprises creating, allocating, and setting up the DCFS directory (Ransil at Fig. 2; col. 9, lines 46-59; col. 65, lines 32-64; col. 66, lines 30-35; col. 67, line 7); and
for a file in the object storage service to be processed (Ransil at col. 7, lines 55-67; col. 8, lines 1-4; col. 20, lines 5-19, 39-61),
access the file system metadata from the database service (Ransil at col. 20, lines 39-44),
access the file from the object storage service according to the file system metadata accessed via the database service (Ransil at col. 20, lines 39-61; col. 22, lines 10-16).
Ransil does not expressly disclose wherein the DCFS stores data objects as files in a file directory structure and the DCFS directory stores the file directory structure of the DCFS.
Vermeulen discloses storing data objects, which can be files, in buckets that are analogous to a file system directory or folder (i.e., stores data objects as files in a file directory structure). Vermeulen at col. 6, lines 6-11; col. 23, lines 31-50. Vermeulen further discloses a DCFS directory that stores the file directory structure of the DCFS using a data object storage space managed by a file system. Vermeulen at col. 7, lines 49-60; col. 23, lines 31-50.
Ransil and Vermeulen are analogous art because they are both directed to the same field of endeavor of distributed storage systems.
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to modify Ransil by adding the feature wherein the DCFS stores data objects as files in a file directory structure and the DCFS directory stores the file directory structure of the DCFS, as disclosed by Vermeulen.
The motivation for doing so would have been that using a hierarchical file system with a directory structure is conventional and familiar (Vermeulen at col. 7, lines 49-51), allowing a client to write to or access particular directory paths with which they are familiar.
Ransil in view of Vermeulen does not expressly disclose that the compute node of the distributed computing system, provisioned by the distributed computing service of the network-accessible service provider, performs the accessing of file system metadata from the database service, the accessing the file from the object storage service according to the file system metadata accessed via the database service, modifying the accessed file, and writing the modified file to the object storage service of the network-accessible service provider according to the file system metadata accessed via the database service of the network-accessible service provider.
Gammaraju discloses a system and method for a multi-tenant implementation of a distributed file system, such as the Hadoop Distributed File System (HDFS). Gammaraju at para. 0011. As set forth in the rejection above, Ransil discloses provisioning of a distributed computing system based on client configuration input. Much like Ransil, Gammaraju discloses receiving a request to instantiate and deploy a distributed file system comprising a plurality of hosts, which comprise a name node (i.e., DCFS directory), compute nodes, and data nodes (i.e., object storage). Gammaraju at Figs. 1A, 6; paras. 0021, 0048-54. Gammaraju further discloses that compute nodes (also referred to as compute VMs) carry out tasks from clients, including accessing data VMs (i.e., object storage) to perform reading and writing of data blocks during execution of a job (i.e., accessing the file system metadata and accessing the file from the object storage). A job can include processing files in the file system (i.e., modifying a file) and storing the output of the job in a directory (i.e., modifying a file and writing the modified file to the object storage service). Gammaraju at paras. 0021, 0039-40. The data nodes and name node are all part of the network-accessible service provider, which provides the distributed computing system, because the distributed computing service that implements the distributed computing system is of the network-accessible service provider.
Ransil, Vermeulen, and Gammaraju are analogous art because they are directed to the same field of endeavor of distributed computing systems.
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to modify Ransil in view of Vermeulen by having the compute node of the distributed computing system, provisioned by the distributed computing service of the network-accessible service provider, perform the accessing of the file system metadata from the database service, the accessing of the file from the object storage service according to the file system metadata accessed via the database service, the modifying of the accessed file, and the writing of the modified file to the object storage service of the network-accessible service provider according to the file system metadata accessed via the database service of the network-accessible service provider, as disclosed by Gammaraju.
The motivation for doing so would have been that, in an HDFS, data nodes and compute nodes can be separated to allow the compute nodes to be elastically scaled based on the needs of a distributed application. Gammaraju at para. 0011. Moreover, the use of compute nodes to perform a job requested by a client allows tasks to be carried out in parallel. Gammaraju at paras. 0019-20.
Claims 36 and 39-41 are essentially the same as claims 26 and 30-32, respectively, in the form of a system. Therefore, they are rejected for the same reasons.
In regards to claim 42, Ransil discloses one or more non-transitory computer-readable media having stored program instructions (Ransil at col. 72, lines 43-47) that, when executed on or across one or more processors of one or more computing systems of a network-accessible service (Ransil at Fig. 4; col. 29, lines 15-55), cause the network-accessible service to:
receive, via a distributed computing service of a network-accessible service provider, configuration input from a client for provisioning and configuring a distributed computing system (Ransil at col. 65, lines 32-58);
receive, via the distributed computing service, a request from the client to provision the distributed computing system (Ransil at Fig. 4; col. 5, lines 1-10; col. 29, lines 15-55; col. 33, lines 40-43; col. 65, lines 32-64; col. 66, lines 30-35; col. 67, line 7);
provide, responsive to the request to provision, the distributed computing system according to the configuration input (Ransil at col. 65, lines 32-58), including:
provision, by the distributed computing service and responsive to the request to provision, one or more compute nodes for the distributed computing system, wherein provisioning comprises creating, allocating, and setting up the one or more compute nodes (Ransil at col. 3, lines 21-27; col. 65, lines 32-64; col. 66, lines 30-35; col. 67, line 7);
provision, by the distributed computing service and responsive to the request to provision, the distributed computing file system (DCFS) for the distributed computing system via an object storage service, of the network-accessible service provider, (Ransil at col. 3, lines 21-27; col. 5, lines 1-7; col. 8, lines 5-7) that implements a first client-facing interface, wherein provisioning comprises creating, allocating, and setting up the DCFS (Ransil at col. 5, lines 1-10; col. 9, lines 46-59; col. 65, lines 32-64; col. 66, lines 30-35; col. 67, line 7);
provision, by the distributed computing service and responsive to the request to provision, a DCFS directory for the distributed computing system via a database service, of the network-accessible service provider, (Ransil at col. 5, lines 1-7; col. 7, lines 4-7) that implements a second client-facing interface, wherein provisioning comprises creating, allocating, and setting up the DCFS directory (Ransil at Fig. 2; col. 9, lines 46-59; col. 65, lines 32-64; col. 66, lines 30-35; col. 67, line 7); and
for a file in the object storage service to be processed (Ransil at col. 7, lines 55-67; col. 8, lines 1-4; col. 20, lines 5-19, 39-61),
access the file system metadata from the database service (Ransil at col. 20, lines 39-44),
access the file from the object storage service according to the file system metadata accessed via the database service (Ransil at col. 20, lines 39-61; col. 22, lines 10-16).
Ransil does not expressly disclose wherein the DCFS stores data objects as files in a file directory structure and the DCFS directory stores the file directory structure of the DCFS.
Vermeulen discloses storing data objects, which can be files, in buckets that are analogous to a file system directory or folder (i.e., stores data objects as files in a file directory structure). Vermeulen at col. 6, lines 6-11; col. 23, lines 31-50. Vermeulen further discloses a DCFS directory that stores the file directory structure of the DCFS using a data object storage space managed by a file system. Vermeulen at col. 7, lines 49-60; col. 23, lines 31-50.
Ransil and Vermeulen are analogous art because they are both directed to the same field of endeavor of distributed storage systems.
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to modify Ransil by adding the feature wherein the DCFS stores data objects as files in a file directory structure and the DCFS directory stores the file directory structure of the DCFS, as disclosed by Vermeulen.
The motivation for doing so would have been that using a hierarchical file system with a directory structure is conventional and familiar (Vermeulen at col. 7, lines 49-51), allowing a client to write to or access particular directory paths with which they are familiar.
Ransil in view of Vermeulen does not expressly disclose that the compute node of the distributed computing system, provisioned by the distributed computing service of the network-accessible service provider, performs the accessing of file system metadata from the database service, the accessing the file from the object storage service according to the file system metadata accessed via the database service, modifying the accessed file, and writing the modified file to the object storage service of the network-accessible service provider according to the file system metadata accessed via the database service of the network-accessible service provider.
Gammaraju discloses a system and method for a multi-tenant implementation of a distributed file system, such as the Hadoop Distributed File System (HDFS). Gammaraju at para. 0011. As set forth in the rejection above, Ransil discloses provisioning of a distributed computing system based on client configuration input. Much like Ransil, Gammaraju discloses receiving a request to instantiate and deploy a distributed file system comprising a plurality of hosts, which comprise a name node (i.e., DCFS directory), compute nodes, and data nodes (i.e., object storage). Gammaraju at Figs. 1A, 6; paras. 0021, 0048-54. Gammaraju further discloses that compute nodes (also referred to as compute VMs) carry out tasks from clients, including accessing data VMs (i.e., object storage) to perform reading and writing of data blocks during execution of a job (i.e., accessing the file system metadata and accessing the file from the object storage). A job can include processing files in the file system (i.e., modifying a file) and storing the output of the job in a directory (i.e., modifying a file and writing the modified file to the object storage service). Gammaraju at paras. 0021, 0039-40. The data nodes and name node are all part of the network-accessible service provider, which provides the distributed computing system, because the distributed computing service that implements the distributed computing system is of the network-accessible service provider.
Ransil, Vermeulen, and Gammaraju are analogous art because they are directed to the same field of endeavor of distributed computing systems.
At the time before the effective filing date of the instant application, it would have been obvious to one of ordinary skill in the art to modify Ransil in view of Vermeulen by adding the features of the compute node of the distributed computing system, provisioned by the distributed computing service of the network-accessible service provider, performing the accessing of file system metadata from the database service, the accessing of the file from the object storage service according to the file system metadata accessed via the database service, the modifying of the accessed file, and the writing of the modified file to the object storage service of the network-accessible service provider according to the file system metadata accessed via the database service of the network-accessible service provider, as disclosed by Gammaraju.
The motivation for doing so would have been that, in an HDFS, data nodes and compute nodes can be separated to allow the compute nodes to be elastically scaled based on the needs of a distributed application. Gammaraju at para. 0011. Moreover, the use of compute nodes to perform a job requested by a client allows for carrying out tasks in parallel. Gammaraju at paras. 0019-20.
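For illustration only, the workflow described in this combination (a compute node that consults a database-backed directory for file system metadata, fetches the file from object storage according to that metadata, modifies it, and writes the result back) can be sketched as follows; every name in the sketch is hypothetical and is not drawn from Ransil, Vermeulen, or Gammaraju.

```python
# Hypothetical sketch of the compute-node workflow discussed above; the class
# and function names are illustrative only and appear in no cited reference.

class MetadataDirectory:
    """Stands in for the database service storing file system metadata."""
    def __init__(self):
        self._entries = {}          # file name -> {"key": object-store key}

    def put(self, name, key):
        self._entries[name] = {"key": key}

    def lookup(self, name):
        return self._entries[name]  # metadata used to locate the file


class ObjectStore:
    """Stands in for the object storage service holding file contents."""
    def __init__(self):
        self._objects = {}

    def get(self, key):
        return self._objects[key]

    def put(self, key, data):
        self._objects[key] = data


def process_file(name, directory, store, transform):
    """Compute-node job: access metadata, fetch the file according to that
    metadata, modify it, and write the result back to the object store."""
    meta = directory.lookup(name)     # 1. metadata from the database service
    data = store.get(meta["key"])     # 2. file from object storage, per metadata
    modified = transform(data)        # 3. modify the accessed file
    store.put(meta["key"], modified)  # 4. write back according to the metadata
    return modified


# Example: one compute node uppercases a stored file.
directory = MetadataDirectory()
store = ObjectStore()
store.put("bucket/log-1", "hello")
directory.put("log-1", "bucket/log-1")
result = process_file("log-1", directory, store, str.upper)
```

The point of the sketch is only the division of labor: the compute node touches the directory for metadata and the object store for content, and the two are separate services.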
Claims 43 and 44 are essentially the same as claims 30 and 32, respectively, in the form of a computer readable medium. Therefore, they are rejected for the same reasons.
Claims 27-29, 37, and 38 are rejected under 35 U.S.C. 103 as being unpatentable over Ransil et al. (US Patent 7,801,912) (Ransil) of record, in view of Vermeulen et al. (US Patent 7,716,180) (Vermeulen) of record and Gammaraju et al. (US Patent Pub 2015/0120791) (Gammaraju), further in view of Gross et al. (US Patent Pub 2009/0144388) (Gross) of record.
In regards to claim 27, Ransil in view of Vermeulen and Gammaraju discloses the method of claim 25, but does not expressly disclose further comprising: for a request to retrieve a data object, checking, by one of the one or more compute nodes, a cache for the data object before accessing the DCFS to retrieve the data object. Ransil does disclose having a query cache that is consulted prior to searching the index for locator information (i.e., metadata) but does not expressly disclose it is also consulted to retrieve a data object. Ransil at col. 32, lines 20-24.
Gross discloses a distributed storage system with a centralized metadata server. Gross further discloses data objects are stored on a cache and checking the cache before querying the central metadata server to retrieve data objects from clustered storage. Gross at paras. 0031, 0039-41.
Ransil, Vermeulen, Gammaraju, and Gross are analogous art because they are directed to the same field of endeavor of distributed storage systems.
At the time before the effective filing date of the instant application, it would have been obvious to one of ordinary skill in the art to modify Ransil in view of Vermeulen and Gammaraju by adding the feature of for a request to retrieve a data object, checking, by one of the one or more compute nodes, a cache for the data object before accessing the DCFS to retrieve the data object, as disclosed by Gross.
The motivation for doing so would have been to increase efficiency in retrieving data objects. Gross at para. 0041.
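The cache-first retrieval attributed to Gross can be illustrated with a minimal sketch, in which one plain dictionary stands in for a compute node's local cache and another for the DCFS; all names are hypothetical and not taken from Gross.

```python
# Illustrative sketch of checking a local cache before accessing the DCFS;
# names are hypothetical and not drawn from Gross or any cited reference.

class CachingReader:
    def __init__(self, dcfs):
        self._dcfs = dcfs      # backing distributed file system (a dict here)
        self._cache = {}       # local per-node cache of data objects
        self.dcfs_reads = 0    # counts how often the DCFS was actually hit

    def get(self, key):
        if key in self._cache:         # 1. consult the cache first
            return self._cache[key]
        self.dcfs_reads += 1           # 2. fall back to the DCFS on a miss
        value = self._dcfs[key]
        self._cache[key] = value       # 3. populate the cache for next time
        return value


reader = CachingReader({"obj-1": b"payload"})
first = reader.get("obj-1")    # miss: read from the DCFS and cache the result
second = reader.get("obj-1")   # hit: served from the local cache
```

The second request never reaches the DCFS, which is the efficiency rationale cited from Gross at para. 0041.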
In regards to claim 28, Ransil in view of Vermeulen, Gammaraju, and Gross discloses the method of claim 27, wherein the cache is implemented as a local cache of the one compute node. Gross at para. 0025.
In regards to claim 29, Ransil in view of Vermeulen, Gammaraju, and Gross discloses the method of claim 27, further comprising: checking, by the one compute node, the cache for metadata of the data object before accessing the DCFS directory for the metadata of the data object. Ransil at col. 32, lines 20-24.
Claims 37 and 38 are essentially the same as claims 27 and 29, respectively, in the form of a system. Therefore, they are rejected for the same reasons.
Claims 25, 35, and 42 are rejected under 35 U.S.C. 103 as being unpatentable over Amazon (“Amazon Elastic MapReduce – Developer Guide”, 3/31/2009), in view of Gammaraju et al. (US Patent Pub 2015/0120791) (Gammaraju).
In regards to claim 25, Amazon discloses a method, comprising:
receiving, via a distributed computing service of a network-accessible service provider, configuration input from a client for provisioning a distributed computing system (Amazon at pgs. 6-7, 10, 34-134);
receiving, via the distributed computing service, a request from the client to provision the distributed computing system (Amazon at pgs. 10, 17-24, 34-134);
providing, responsive to the request to provision, the distributed computing system according to the configuration input (Amazon at pgs. 17-24), including:
provisioning, by the distributed computing service and responsive to the request to provision, one or more compute nodes for the distributed computing system, wherein provisioning comprises creating, allocating, and setting up the one or more compute nodes (Amazon at pgs. 7, 17-24);
provisioning, by the distributed computing service and responsive to the request to provision, a distributed computing file system (DCFS) for the distributed computing system via an object storage service, of the network-accessible service provider, that implements a first client-facing interface, wherein provisioning comprises creating, allocating, and setting up the DCFS, and wherein the DCFS stores data objects as files in a file directory structure (Amazon at pgs. 4, 7, 14-15); and
provisioning, by the distributed computing service and responsive to the request to provision, a DCFS directory for the distributed computing system via a database service, of the network-accessible service provider, that implements a second client-facing interface, wherein provisioning comprises creating, allocating, and setting up the DCFS directory, and wherein the DCFS directory stores the file directory structure of the DCFS (Amazon at pgs. 181, 200, 211-218, 284-298);
Amazon does not expressly disclose for a file in the object storage service to be processed at a compute node of the one or more compute nodes provisioned by the distributed computing service of the network-accessible service provider: accessing, by the compute node, file system metadata from the database service, accessing, by the compute node, the file from the object storage service according to the file system metadata accessed via the database service, modifying, by the compute node, the accessed file, and writing the modified file to the object storage service of the network-accessible service provider according to the file system metadata accessed via the database service of the network-accessible service provider.
Gammaraju discloses a system and method for a multi-tenant implementation of a distributed file system, such as the Hadoop Distributed File System (HDFS). Gammaraju at para. 0011. Gammaraju discloses receiving a request to instantiate and deploy a distributed file system comprising a plurality of hosts, which comprise a name node (i.e., DCFS directory), compute nodes, and data nodes (i.e., object storage). Gammaraju at Figs. 1A, 6; paras. 0021, 0048-54. Gammaraju further discloses that compute nodes (also referred to as compute VMs) carry out tasks from clients, including accessing data VMs (i.e., object storage) to perform reading and writing of data blocks during execution of a job (i.e., accessing the file system metadata and accessing the file from the object storage). A job can include processing files in the file system (i.e., modifying a file) and storing the output of the job in a directory (i.e., modifying a file and writing the modified file to the object storage service). Gammaraju at paras. 0021, 0039-40.
Amazon and Gammaraju are analogous art because they are directed to the same field of endeavor of implementing a distributed computing system.
At the time before the effective filing date of the instant application, it would have been obvious to one of ordinary skill in the art to modify Amazon by adding the features of for a file in the object storage service to be processed at a compute node of the one or more compute nodes provisioned by the distributed computing service of the network-accessible service provider: accessing, by the compute node, file system metadata from the database service, accessing, by the compute node, the file from the object storage service according to the file system metadata accessed via the database service, modifying, by the compute node, the accessed file, and writing the modified file to the object storage service of the network-accessible service provider according to the file system metadata accessed via the database service of the network-accessible service provider, as disclosed by Gammaraju.
The motivation for doing so would have been that, in an HDFS, data nodes and compute nodes can be separated to allow the compute nodes to be elastically scaled based on the needs of a distributed application. Gammaraju at para. 0011. Moreover, the use of compute nodes to perform a job requested by a client allows for carrying out tasks in parallel. Gammaraju at paras. 0019-20. As set forth above, Amazon discloses methods and systems for provisioning a distributed computing system as claimed but does not go into detail about specific client interactions for modifying files stored in the object storage. Gammaraju discloses servicing such client requests as discussed above.
Claim 35 is essentially the same as claim 25 in the form of a system comprising one or more hardware processors and memory (Amazon at pgs. 2, 5). Therefore, it is rejected for the same reasons.
Claim 42 is essentially the same as claim 25 in the form of a non-transitory computer readable medium. Amazon at pgs. 2, 5. Therefore, it is rejected for the same reasons.
Claims 26, 30-34, 36, 39-41, 43, and 44 are rejected under 35 U.S.C. 103 as being unpatentable over Amazon (“Amazon Elastic MapReduce – Developer Guide”, 3/31/2009), in view of Gammaraju et al. (US Patent Pub 2015/0120791) (Gammaraju), further in view of Vermeulen et al. (US Patent 7,716,180) (Vermeulen).
In regards to claim 26, Amazon in view of Gammaraju discloses the method of claim 25, wherein:
a. the object storage service is not guaranteed to return a latest version of the data objects updated via the first client-facing interface. Amazon at pg. 388.
Amazon in view of Gammaraju does not expressly disclose that the database service is guaranteed to return a latest version of the metadata of the data objects updated via the second client-facing interface.
Vermeulen discloses a keymap (i.e., DCFS directory) in a distributed storage system that is used to determine the location of data objects in response to client requests. The keymap is updated atomically in a strictly synchronous fashion, which guarantees that changes made to the data objects are immediately reflected across the system, thereby guaranteeing that the latest version of the metadata is returned. Vermeulen at col. 6, lines 40-44; col. 13, lines 34-39; col. 35, lines 27-33; col. 40, lines 6-9.
Amazon, Gammaraju, and Vermeulen are analogous art because they are directed to the same field of endeavor of distributed storage systems.
At the time before the effective filing date of the instant application, it would have been obvious to one of ordinary skill in the art to modify Amazon in view of Gammaraju by adding the feature of the database service is guaranteed to return a latest version of the metadata of the data objects updated via the second client facing interface, as disclosed by Vermeulen.
The motivation for doing so would have been that Amazon in view of Gammaraju already discloses attempting to provide information that is as close to up to date as possible, as discussed above. Modifying Amazon in view of Gammaraju to ensure the searchable index is updated in a strictly synchronous fashion would guarantee that the information is always up to date.
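The contrast drawn above, an eventually consistent object store versus a strictly synchronously updated directory, can be sketched schematically; the classes below are purely illustrative stand-ins and do not appear in Amazon, Gammaraju, or Vermeulen.

```python
# Hypothetical sketch of the consistency contrast discussed above: an
# eventually consistent object store may serve a stale replica, while a
# directory updated atomically with each write always reflects the latest
# version. Names and structure are illustrative only.

class EventuallyConsistentStore:
    def __init__(self):
        self._latest = {}
        self._replica = {}      # lags behind until replicate() runs

    def put(self, key, data):
        self._latest[key] = data        # replica is not updated yet

    def get(self, key):
        return self._replica.get(key)   # may return a stale value, or nothing

    def replicate(self):
        self._replica.update(self._latest)


class SynchronousDirectory:
    """Updated atomically with every write, so reads see the latest version."""
    def __init__(self):
        self._versions = {}

    def record_write(self, key):
        self._versions[key] = self._versions.get(key, 0) + 1

    def latest_version(self, key):
        return self._versions.get(key, 0)


store = EventuallyConsistentStore()
directory = SynchronousDirectory()
store.put("f", "v1")
directory.record_write("f")
stale = store.get("f")                   # replica has not caught up yet: None
version = directory.latest_version("f")  # directory already reports version 1
store.replicate()
fresh = store.get("f")                   # after replication the read succeeds
```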
In regards to claim 30, Amazon in view of Gammaraju discloses the method of claim 25, but does not expressly disclose wherein a synchronization job is executed by an agent implemented in the distributed computing system and comprises:
a. checking for inconsistencies between the DCFS and the DCFS directory; and
b. responsive to discovery of an inconsistency between the DCFS and DCFS directory, resolving the inconsistency.
Vermeulen discloses determining when the keymap has not been properly updated and executing reconciliation procedures to correct the issue. This determination is performed periodically. Vermeulen at col. 22, lines 5-31, 65-67; col. 23, lines 1-10; col. 42, lines 19-40. Vermeulen discloses, in the case where one keymap instance is out of sync with the rest of the keymap instances, a divergence and inconsistency is created causing clients to access out of date or different versions of object data. Vermeulen at col. 40, lines 23-27. Thus, the synchronization protocol performed by the system of Vermeulen is determining inconsistencies between keymap instances, which is determining inconsistencies between the object storage and the keymap as a whole, and resolving the inconsistencies by synchronizing the keymap instances to a consistent state. Vermeulen at col. 42, lines 19-40. In this way, the keymap of the system is synchronized with the object storage of the system (i.e., synchronizing the DCFS with the DCFS directory).
At the time before the effective filing date of the instant application, it would have been obvious to one of ordinary skill in the art to modify Amazon in view of Gammaraju by adding the features of wherein a synchronization job is executed by an agent implemented in the distributed computing system and comprises: checking for inconsistencies between the DCFS and the DCFS directory and, responsive to discovery of an inconsistency between the DCFS and DCFS directory, resolving the inconsistency, as disclosed by Vermeulen.
The motivation for doing so would have been to ensure that the data and its index are consistent. Vermeulen at col. 40, lines 38-32.
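A minimal sketch of such a synchronization job, assuming plain dictionaries stand in for the DCFS and the DCFS directory (all names hypothetical and not drawn from Vermeulen):

```python
# Illustrative sketch of the synchronization job described above: compare the
# DCFS directory's entries against the objects actually present in the DCFS
# and correct directory entries that disagree. All names are hypothetical.

def find_inconsistencies(dcfs, directory):
    """Return keys whose directory entry is missing or out of date."""
    bad = []
    for key, data in dcfs.items():
        entry = directory.get(key)
        if entry is None or entry["size"] != len(data):
            bad.append(key)
    return bad


def resolve(dcfs, directory):
    """Synchronization job: check for inconsistencies, then repair them."""
    for key in find_inconsistencies(dcfs, directory):
        directory[key] = {"size": len(dcfs[key])}   # correct the metadata
    return directory


dcfs = {"a": b"12345", "b": b"xy"}
directory = {"a": {"size": 3}}          # stale size for "a"; "b" is missing
repaired = resolve(dcfs, directory)
```

In an actual deployment this check could run periodically as a background process or on a client command, matching claims 32 and 33.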
In regards to claim 31, Amazon in view of Gammaraju and Vermeulen discloses the method of claim 30, wherein resolving the inconsistency comprises correcting one or more entries in the metadata stored in the DCFS directory. Vermeulen at col. 42, lines 19-40.
In regards to claim 32, Amazon in view of Gammaraju and Vermeulen discloses the method of claim 30, wherein the checking is performed periodically as a background process. Vermeulen at col. 23, lines 1-5.
In regards to claim 33, Amazon in view of Gammaraju and Vermeulen discloses the method of claim 30, wherein the checking is performed in response to a command of the client. Amazon at pg. 253.
In regards to claim 34, Amazon in view of Gammaraju and Vermeulen discloses the method of claim 30, further comprising provisioning a resource instance for the distributed computing system to implement the agent. Vermeulen at col. 59, lines 31-35.
Claims 36 and 39-41 are essentially the same as claims 26 and 30-32, respectively, in the form of a system. Therefore, they are rejected for the same reasons.
Claims 43 and 44 are essentially the same as claims 30 and 32, respectively, in the form of a computer readable medium. Therefore, they are rejected for the same reasons.
Claims 27-29, 37, and 38 are rejected under 35 U.S.C. 103 as being unpatentable over Amazon (“Amazon Elastic MapReduce – Developer Guide”, 3/31/2009), in view of Gammaraju et al. (US Patent Pub 2015/0120791) (Gammaraju), further in view of Gross et al. (US Patent Pub 2009/0144388) (Gross).
In regards to claim 27, Amazon in view of Gammaraju discloses the method of claim 25, but does not expressly disclose further comprising: for a request to retrieve a data object, checking, by one of the one or more compute nodes, a cache for the data object before accessing the DCFS to retrieve the data object.
Gross discloses a distributed storage system with a centralized metadata server. Gross further discloses data objects are stored on a cache and checking the cache before querying the central metadata server to retrieve data objects from clustered storage. Gross at paras. 0031, 0039-41.
Amazon, Gammaraju, and Gross are analogous art because they are directed to the same field of endeavor of distributed storage systems.
At the time before the effective filing date of the instant application, it would have been obvious to one of ordinary skill in the art to modify Amazon in view of Gammaraju by adding the feature of for a request to retrieve a data object, checking, by one of the one or more compute nodes, a cache for the data object before accessing the DCFS to retrieve the data object, as disclosed by Gross.
The motivation for doing so would have been to increase efficiency in retrieving data objects. Gross at para. 0041.
In regards to claim 28, Amazon in view of Gammaraju and Gross discloses the method of claim 27, wherein the cache is implemented as a local cache of the one compute node. Gross at para. 0025.
In regards to claim 29, Amazon in view of Gammaraju and Gross discloses the method of claim 27, further comprising: checking, by the one compute node, the cache for metadata of the data object before accessing the DCFS directory for the metadata of the data object. Amazon at pgs. 97-98.
Claims 37 and 38 are essentially the same as claims 27 and 29, respectively, in the form of a system. Therefore, they are rejected for the same reasons.
Response to Arguments
Rejection of claims 25, 26, 30-36, and 39-44 under 35 U.S.C. 103
Applicant’s arguments regarding the rejections of claims 25, 26, 30-36, and 39-44 under 35 U.S.C. 103 have been fully considered but they are not persuasive.
In regards to claim 25, Applicant alleges the cited prior art fails to disclose (1) “provisioning, by the distributed computing service and responsive to the request to provision, a distributed computing file system (DCFS) for the distributed computing system via an object storage service, of the network accessible service provider, that implements a first client-facing interface, wherein provisioning comprises creating, allocating, and setting up the DCFS” and (2) “for a file in the object storage service to be processed at a compute node of the one or more compute nodes provisioned by the distributed computing service of the network-accessible service provider, accessing, by the compute node, file system metadata from the database service, accessing, by the compute node, the file from the object storage service according to the file system metadata accessed via the database service, modifying, by the compute node, the accessed file, and writing the modified file to the object storage service of the network-accessible service provider according to the file system metadata accessed via the database service of the network-accessible service provider.”
Examiner is required to give claims their broadest reasonable interpretation in light of the specification. However, limitations from the specification are not read into the claims. MPEP 2111.
In regards to limitation (1), Applicant argues the “searchable index” of Ransil is not implemented via a database service and instead only describes a searchable index to data stored in databases. Remarks at 11. Examiner respectfully disagrees. Ransil discloses that the searchable data service that implements the searchable index, may use a key-value pair storage to store the locators. These locators are retrieved from the searchable index in response to a user query and then used to access data entities stored in the data store. Ransil at col. 6, lines 36-41, 55-64. Ransil further discloses the searchable data service uses a key-value pair storage to store the locators in the searchable index and that the key-value pair storage can be implemented according to an associative dictionary database architecture, such as Berkeley Database. Ransil at col. 8, lines 61-67; col. 9, lines 1-18. For at least these reasons, the “searchable index” of Ransil is implemented via a database service.
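The two-step lookup attributed to Ransil (query the searchable index for locators, then use the locators to reach the data store) can be sketched schematically as follows; the dictionaries merely stand in for the key-value storage and the data store and are not drawn from Ransil itself.

```python
# Schematic two-step lookup: a key-value index maps query terms to locators,
# and the locators are then used to retrieve entities from the data store.
# Purely illustrative; none of these names come from Ransil.

searchable_index = {            # key-value storage of locators (metadata)
    "report-2009": ["store-1/obj-17"],
}
data_store = {                  # distributed data store of entities
    "store-1/obj-17": {"title": "report-2009", "body": "..."},
}

def query(term):
    locators = searchable_index.get(term, [])       # step 1: fetch locators
    return [data_store[loc] for loc in locators]    # step 2: fetch entities

results = query("report-2009")
```

The sketch highlights why the index qualifies as a database service in the rejection: it is a key-value mapping consulted separately from, and prior to, the data store itself.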
In regards to limitation (2), Applicant seems to argue Gammaraju does not disclose the limitation because Gammaraju accesses a name node that resides on the distributed file system while the claim requires accessing two different sources of data implemented by two different types of service provider services. Remarks at 12-13. Examiner respectfully disagrees. First, in response to applicant's arguments against the references individually, one cannot show nonobviousness by attacking references individually where the rejections are based on combinations of references. See In re Keller, 642 F.2d 413, 208 USPQ 871 (CCPA 1981); In re Merck & Co., 800 F.2d 1091, 231 USPQ 375 (Fed. Cir. 1986). The rejection is based on the combination of Ransil, Vermeulen, and Gammaraju. As set forth in the rejection and the explanation of limitation (1) above, Ransil discloses a database service that implements a searchable index (i.e., DCFS directory) that is searched to find location information for data objects, which is used to access the data objects. What is not expressly disclosed by Ransil is that these tasks are performed by a compute node. Gammaraju discloses compute nodes, which are worker nodes, that are intended to carry out tasks associated with the distributed computing application. Such tasks include client requests to access objects stored on the data nodes for reading and writing. Gammaraju at paras. 0020-21. As shown in Fig. 1A, compute nodes are distinct entities from data nodes. Gammaraju at Fig. 1A. 
Accordingly, the combination of Ransil, Vermeulen, and Gammaraju results in a distributed computing system that implements an object storage service storing data objects (i.e., the data store in Ransil), a directory storing metadata about the file directory structure of the storage service (i.e., the searchable index implemented via a key-value database service in Ransil), and one or more compute nodes, which can access both to perform client requests, such as reading and writing data objects. For at least these reasons, Ransil in view of Vermeulen and Gammaraju discloses limitation (2).
Applicant further argues that the combination of cited art would be improper because it would improperly “change the principle of operation of the primary reference” to change the coordination functionality of Ransil with the compute nodes of Gammaraju. Remarks at 13. Examiner respectfully disagrees. Much like Ransil, Gammaraju also discloses a coordinator type node, which accepts job requests from clients and routes the task to appropriate nodes for completion. This node is called an application workload scheduler while nodes that complete the tasks are called compute nodes. Gammaraju at Fig. 1A. Therefore, the combination results in the addition of compute nodes, which are separate from the data nodes storing data and metadata, to provide the benefit of the ability to elastically scale based on the needs of the distributed computing application. Gammaraju at para. 0011. Therefore, contrary to Applicant’s arguments, the combination is proper.
Applicant further argues Ransil does not disclose (1) “receiving, via a distributed computing service of a network-accessible service provider, configuration input from a client for provisioning a distributed computing system,” (2) “receiving, via the distributed computing service, a request from the client to provision the distributed computing system,” and (3) “providing, responsive to the request to provision, the distributed computing system according to the configuration input, including: provisioning, by the distributed computing service and responsive to the request to provision, one or more compute nodes for the distributed computing system, wherein provisioning comprises creating, allocating, and setting up the one or more compute nodes.” Remarks at 13-14. Applicant argues Ransil discloses an already existing system or components that may be used to implement the system but does not expressly disclose that the provisioning is performed by a distributed computing service. Remarks at 14. Examiner respectfully disagrees. Applicant has not explained why the system in Ransil is not a “distributed computing service.” Ransil discloses a distributed system with a web services front end and various nodes to perform various functions. Ransil at col. 7, lines 55-58. Ransil discloses an admin console that allows an administrator to perform tasks such as adding or removing resources from the system (i.e., requesting provisioning; the distributed computing service performs the provisioning). Ransil at col. 65, lines 32-64; col. 66, lines 30-35; col. 67, line 7. Thus, Ransil discloses receiving requests to provision resources (i.e., nodes), and the distributed computing system provisions the resources. In response to Applicant’s argument that the system in Ransil is “already-existing,” the claim does not require that the provisioning of any of the components form a new, distinct system.
Therefore, the broadest reasonable interpretation includes requests that add components to an existing system.
Applicant does not present arguments for the remaining limitations in the claim. Therefore, Examiner asserts Ransil in view of Vermeulen and Gammaraju discloses all the limitations of claim 25. In regards to the remaining claims, Applicant does not present separate arguments. Therefore, they remain rejected for at least the same reasons.
Rejection of claims 27-29, 37, and 38 under 35 U.S.C. 103
Applicant does not present additional arguments in regards to the rejections to claims 27-29, 37, and 38 under 35 U.S.C. 103. Consequently, they remain rejected for at least the same reasons explained above.
Additional Prior Art
Additional relevant prior art references are listed on the attached PTO-892 form. Some examples are:
Becker-Szendy et al. (US Patent 7,243,089) discloses a system and method for federating a local file system into a distributed file system.
Aron et al. (US Patent 9,697,227) discloses a system and method for concurrent access in a distributed file system.
Slik et al. (US Patent 7,546,486) discloses a scalable distributed object management storage system.
Barrall et al. (US Patent 7,457,822) discloses a system and method for a hardware based file system.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Examiner Michael Le whose telephone number is 571-272-7970 and fax number is 571-273-7970. The examiner can normally be reached Mon-Fri 9:30 AM – 6 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Tony Mahmoudi can be reached on 571-272-4078. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/MICHAEL LE/Examiner, Art Unit 2163
/TONY MAHMOUDI/Supervisory Patent Examiner, Art Unit 2163
1 The system provides a distributed data storage and a database that stores a searchable index. Admin console permits an administrator to configure how the searchable data service is provided. The admin console may be implemented in a distributed architecture with remote consoles on a host system (i.e., via a distributed computing service of a network accessible service provider).
2 By using the admin console, the administrator can change the system, such as adding or removing resources (i.e., receiving a request …).
3 The system is configured by the admin through the console for performing the functions of the system.
4 The searchable data service is implemented on a distributed system comprising various nodes (i.e., one or more compute nodes). With the addition of new resources, they are created, allocated (in some manner, for example an IP address), and set up (e.g., with an IP address).
5 The distributed system provides storage for storing data objects.
6 The system provides a data store that stores data objects in a distributed system. The data store can be accessed by the client once the client receives the locators necessary to retrieve/access data objects. With the addition of new resources, they are created, allocated (in some manner, for example an IP address), and set up (e.g., with an IP address).
7 A searchable index stored in a database.
8 A searchable index stores metadata about data objects stored in the data store. It is queried by the client to retrieve locators (i.e., metadata), which is used by the client to access the data store and retrieve data objects. With the addition of new resources, they are created, allocated (in some manner, for example an IP address), and set up (e.g., with an IP address).
9 Coordinator nodes routes client requests to appropriate node to process the file appropriately, such as read or write.
10 The coordinate node queries a local storage node locator (i.e., database service) to map identifier and bucket information (i.e., file system metadata) to find the particular storage node to find the desired data or to store the data.
11 The coordinator node accesses the storage node using the identifier and bucket information (i.e., according to the file system metadata…) to perform the read/write request for the client (i.e., access the file from the object storage service).
12 Replicas across the storage nodes exhibit eventual consistency.
13 The scan is performed periodically.
14 The admin console allows an administrator the ability to detect and correct issues with the system. As modified by Vermeulen, an administrator would be able to check for inconsistencies and resolve them.
15 The system provides a distributed data storage and a database that stores a searchable index. Admin console permits an administrator to configure how the searchable data service is provided. The admin console may be implemented in a distributed architecture with remote consoles on a host system (i.e., via a distributed computing service of a network accessible service provider).
16 By using the admin console, the administrator can change the system, such as adding or removing resources (i.e., receiving a request …).
17 The system is configured by the admin through the console for performing the functions of the system.
18 The searchable data service is implemented on a distributed system comprising various nodes (i.e., one or more compute nodes). With the addition of new resources, they are created, allocated (in some manner, for example an IP address), and set up (e.g., with an IP address).
19 The distributed system provides storage for storing data objects.
20 The system provides a data store that stores data objects in a distributed system. The data store can be accessed by the client once the client receives the locators necessary to retrieve/access data objects. With the addition of new resources, they are created, allocated (in some manner, for example an IP address), and set up (e.g., with an IP address).
21 A searchable index is stored in a database.
22 A searchable index stores metadata about data objects stored in the data store. It is queried by the client to retrieve locators (i.e., metadata), which are used by the client to access the data store and retrieve data objects. When new resources are added, they are created, allocated (e.g., assigned an IP address), and set up.
23 Coordinator nodes route client requests to the appropriate node to process the file request, such as a read or write.
24 The coordinator node queries a local storage node locator (i.e., database service) to map identifier and bucket information (i.e., file system metadata) to find the particular storage node where the desired data is located or is to be stored.
25 The coordinator node accesses the storage node using the identifier and bucket information (i.e., according to the file system metadata…) to perform the read/write request for the client (i.e., access the file from the object storage service).
26 The admin console permits an administrator to configure how the searchable data service is provided.
27 By using the admin console, the administrator can change the system, such as adding or removing resources (i.e., receiving a request …).
28 The system is configured by the admin through the console for performing the functions of the system.
29 The searchable data service is implemented on a distributed system comprising various nodes (i.e., one or more compute nodes). When new resources are added, they are created, allocated (e.g., assigned an IP address), and set up.
30 The distributed system provides storage for storing data objects.
31 The system provides a data store that stores data objects in a distributed system. The data store can be accessed by the client once the client receives the locators necessary to retrieve/access data objects. When new resources are added, they are created, allocated (e.g., assigned an IP address), and set up.
32 A searchable index is stored in a database.
33 A searchable index stores metadata about data objects stored in the data store. It is queried by the client to retrieve locators (i.e., metadata), which are used by the client to access the data store and retrieve data objects. When new resources are added, they are created, allocated (e.g., assigned an IP address), and set up.
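The two-step access pattern in notes 31-33 can be sketched as follows. This is a hypothetical illustration (the dictionary names and keys are invented, not taken from the references): the client first queries the searchable index for a locator, then uses that locator to access the data store directly.

```python
# Hypothetical sketch: searchable index maps an attribute to a locator;
# the locator (metadata) is then used to retrieve the object from the data store.

searchable_index = {
    # attribute -> locator (metadata about an object in the data store)
    "report-2009": "node-3/objects/42",
}

data_store = {
    # locator -> stored data object
    "node-3/objects/42": b"object bytes",
}

def fetch(attribute):
    locator = searchable_index[attribute]  # query the searchable index
    return data_store[locator]             # access the data store with the locator
```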
34 Coordinator nodes route client requests to the appropriate node to process the file request, such as a read or write.
35 The coordinator node queries a local storage node locator (i.e., database service) to map identifier and bucket information (i.e., file system metadata) to find the particular storage node where the desired data is located or is to be stored.
36 The coordinator node accesses the storage node using the identifier and bucket information (i.e., according to the file system metadata…) to perform the read/write request for the client (i.e., access the file from the object storage service).
37 Amazon discloses EMR, which is a service (i.e., network accessible service provider) to run managed Hadoop clusters (i.e., distributed computing system). EMR allows users to provision a Hadoop cluster with user-defined selections and actions (i.e., configuration input), such as through a console or CLI. Cited pages 34-134 discuss configuration of the cluster.
38 Pages 17-24 discuss the steps for submitting a job flow (i.e., request) to the system by the user to launch a cluster (i.e., provision the distributed computing system).
39 The cluster (i.e., distributed computing system) is launched in response to the job flow (i.e., request to provision) submitted by the user.
40 A cluster configured according to user selected parameters, is launched (i.e., provisioned). The cluster includes one or more instances of nodes to perform tasks (i.e., compute nodes).
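The job-flow sequence in notes 38-40 (user submits a job flow, and a cluster of configured instances is launched in response) can be sketched as below. This is an illustrative sketch with hypothetical types and field names; the instance type and addressing scheme are invented for the example and are not taken from the Amazon guide.

```python
# Hypothetical sketch: a job flow (the provisioning request) is submitted and a
# cluster of compute-node instances is launched according to its parameters.
from dataclasses import dataclass, field

@dataclass
class JobFlow:
    """The user's request to provision, with user-selected configuration input."""
    name: str
    instance_count: int
    instance_type: str

@dataclass
class Cluster:
    """The provisioned distributed computing system: one or more compute nodes."""
    nodes: list = field(default_factory=list)

def launch(job_flow):
    """Launch a cluster in response to the submitted job flow."""
    cluster = Cluster()
    for i in range(job_flow.instance_count):
        # Each instance is created, allocated (e.g., assigned an address), and set up.
        cluster.nodes.append({
            "id": i,
            "type": job_flow.instance_type,
            "address": f"10.0.0.{i + 1}",
        })
    return cluster
```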
41 Amazon S3 or HDFS can be used to store data in the cluster. The cited pages show user configuration screens for provisioning S3 and uploading data to it. S3 is an object storage that stores files in a directory structure.
42 Hive and DynamoDB are provisioned as part of the cluster. A metastore stores metadata in a key-value store for quickly accessing the object storage (i.e., S3).
43 Hardware configurations of physical servers.
44 Hardware configurations of physical servers include memory or storage (i.e., CRMs).
45 S3 (i.e., object storage) can be implemented with eventual consistency (i.e., not guaranteed to latest version …).
46 The scan is performed periodically.
47 The admin can set consistency flags to ensure consistency checks are performed to resolve inconsistencies. As modified by Vermeulen, an administrator would be able to check for inconsistencies and resolve them.
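The periodic scan that detects and resolves replica inconsistencies (notes 45-47) can be sketched as follows. This is a hypothetical anti-entropy sketch under the assumption that each replica stores versioned entries and the highest version wins; the references do not provide this code.

```python
# Hypothetical sketch of a periodic consistency scan: compare replicas of the
# same key set and repair stale entries with the latest-versioned value.

def scan_and_repair(replicas):
    """Detect and resolve inconsistencies across replicas.

    Each replica is a dict of key -> (version, value); the entry with the
    highest version is treated as authoritative. Returns the repair count.
    """
    repaired = 0
    keys = set().union(*(r.keys() for r in replicas))
    for key in keys:
        # Pick the latest version of this key seen on any replica.
        latest = max((r[key] for r in replicas if key in r),
                     key=lambda entry: entry[0])
        for r in replicas:
            if r.get(key) != latest:
                r[key] = latest  # resolve the inconsistency in place
                repaired += 1
    return repaired
```

Run periodically (e.g., from a scheduler), this converges all replicas toward the latest version, matching the eventual-consistency behavior described above.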
48 The cache includes information about files that can be quickly accessed.