DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA .
Claims 1-20 are pending in this application.
Response to Amendment
This Office Action is in response to applicant’s communication filed on January 12th, 2026. The applicant’s remark and amendments to the claims were considered with the results that follow.
In response to the last Office Action, no claims have been canceled or amended. As a result, claims 1-20 are pending in this application.
Response to Arguments
The applicant’s arguments and amendments, see pg. 14 filed on January 12th, 2026 have been fully considered but they are not persuasive because the claims are obvious variant. The non-statutory double patenting rejection over application 16,600,106 is maintained. A terminal disclaimer has not been filed yet.
Applicant’s arguments, see pg. 15-18, filed on January 12th, 2026, with respect to the rejection of independent claims 1, 8, and 15 as amended under 35 U.S.C 103, where the applicant asserts that the prior arts do not teach or suggest the alleged steps of “partially re-indexing, by the server of the cloud-based storage system, the plurality of shards of the one or more clusters based on addition of a node to the plurality of node in the one or more clusters as recited in each pending independent claim, i.e., wherein partially re-indexing the plurality of shards comprises selecting a set of largest shards in the one or more clusters based on an index size of each shard, redirecting traffic for each shard of the selected set of shards to a different cluster of the one or more clusters while the plurality of shards are being partially reindexed, and re-directing traffic back to each shard of the selected set of shards upon completion of partially reindexing the plurality of shards”.
Examiner respectfully disagrees. Applicant indicates that ISHERWOOD states and requires that all background processing to be suspended during node addition remapping. The claim limitations do not indicate any feature that excludes background processing to be suspended during the node addition and remapping. The current claim limitation specify, “partially re-indexing, by the server of the cloud-based storage system, the plurality of shards of the one or more clusters based on addition of a node to the plurality of node in the one or more clusters”. Examiner correlates this limitation as partially reindexing information based on an addition of a node to the plurality of the node in the cluster.
Additionally, ISHERWOOD teaches this on [0004]-[0005]; This technique is used to manage a database search index to achieve high availability and allow for node expansion/reduction without requiring the regeneration of the whole index. 1. When a node is added to the cluster, there are no new regions added to the cluster. Instead, the region indices are redistributed amongst all the nodes in the cluster to help distribute the work load. [0082]; On node addition, no new regions/shards will be added to the cluster/system, but the node addition remapping module will redistribute the shard cores throughout the cluster to distribute the work load). Accordingly, redistribution of region indices across nodes in a cluster is form of partial reindexing as redistributing of region is not a full reindex and is a form partial indexing among the nodes according based on the node addition would cause to redistribute accordingly.
Additionally, applicant indicates Isherwood fail to teach or
suggest the recited redirecting of traffic while the shards are being re-indexed, but Isherwood explicitly teaches away from such redirection.
Examiner respectfully disagree. The claims do not indicate such matter on that. The claim recite, “redirecting traffic for each shard of the selected set of shards to a different cluster of the one or more clusters while the plurality of shards are being partially reindexed”.
This limitation is indicating redirect traffic of the selected to a different cluster while the other are being partially reindexed. This is a form of a shard being reconstructed thus, allow the traffic of the other shards to redistribute to another set of nodes to prevent disturbing the work.
Additionally, ISHERWOOD still teach the limitation of, “redirecting traffic for each shard of the selected set of shards to a different cluster of the one or more clusters while the plurality of shards are being partially reindexed”.
ISHERWOOD teaches the redirecting traffic on [0082], “On node addition, no new regions/shards will be added to the cluster/system, but the node addition remapping module will redistribute the shard cores throughout the cluster to distribute the work load. In step 808, it regenerates the region mapping table using the content index hash algorithm and new locations of the shard cores”. That is while the data is being remapped due to the new addition of the node this allows the workload to be distributed else where while regenerating a region map to a new location of the shard.
Accordingly, when redirecting the traffic for each shard to a different cluster. ISHERWOOD teaches the redirecting based on moving the shards based on the volume as show on [0091]-[0092]. ISHERWOOD specify on [0091]-[0092], “When placing/moving shard cores, the determination of which volume to use is based on both shard core or shard distribution and on load calculation. This ratio heavily favors volumes which have no shard cores or which have lots of free space for shard cores. The volumes are sorted by ratio (high to low) and are traversed until the module finds the best possible match (e.g., most space”. The partial reindexing comes from moving the shards from throughout the cluster while having a table to reflect those changes.
Also, ISHERWOOD additionally teaches re-directing traffic back to each shard of the selected set of shards upon completion of partially reindexing the plurality of shards. This limitation is merely specifying to update index from old outdated shards to new updated shards by creating new locations, moving data, and upon completion, the indexes are updated to point to the new location. Accordingly, ISHERWOOD teaches on [0008], “If necessary, as when a node is added or removed/regenerated, the hash algorithm is re-evaluated to facilitate the new region distribution. With the new hash algorithm, a new region mapping table is regenerated and distributed to all nodes in the cluster”.
ISHERWOOD indicates creating a new index based on a new node, and the newly added region distribution is the indication of reindicating the traffic to the new location based on completion in which the new region mapping table would display the completion of the new region that shows the new redistribution based on the new addition of the node in the cluster.
As such, ISHERWOOD teaches the limitations as explained above.
Double Patenting
Claims 1-20 of this application is patentably indistinct from claims 1-20 of Application No. 16/600,106. Pursuant to 37 CFR 1.78(f), when two or more applications filed by the same applicant or assignee contain patentably indistinct claims, elimination of such claims from all but one application may be required in the absence of good and sufficient reason for their retention during pendency in more than one application. Applicant is required to either cancel the patentably indistinct claims from all but one application or maintain a clear line of demarcation between the applications. See MPEP § 822.
The non-statutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A non-statutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on non-statutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP § 2146 et seq. for applications not subject to examination under the first inventor to file provisions of the AIA . A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
The filing of a terminal disclaimer by itself is not a complete reply to a non-statutory double patenting (NSDP) rejection. A complete reply requires that the terminal disclaimer be accompanied by a reply requesting reconsideration of the prior Office action. Even where the NSDP rejection is provisional the reply must be complete. See MPEP § 804, subsection I.B.1. For a reply to a non-final Office action, see 37 CFR 1.111(a). For a reply to final Office action, see 37 CFR 1.113(c). A request for reconsideration while not provided for in 37 CFR 1.113(c) may be filed after final for consideration. See MPEP §§ 706.07(e) and 714.13.
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The actual filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA /25, or PTO/AIA /26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/apply/applying-online/eterminal-disclaimer.
The subject matter claimed in the instant application is fully disclosed in the U.S Patent No. 11599500 and is covered by the patent since the Patent application and instant application are claiming common subject matters, as follows:
Instant Application:18/164,105(hereinafter as “Roy 105”)
Claim 1
U.S Patent No. 11,599,500 (hereinafter as “Roy 500”)
Claim 1
1. A method for distributing data across a plurality of storage shards, the method comprising:
generating, by a server of a cloud-based storage system, a file key for each file of a plurality of files stored in a plurality of physical shards, each physical shard maintained by a node of a plurality of nodes in one or more clusters;
sorting, by the server of the cloud-based storage system, the generated file keys for each file of the plurality of files into an ordered list;
logically partitioning, by the server of the cloud-based storage system, the ordered list into a plurality of logical shards;
mapping, by the server of the cloud-based storage system, each logical shard of the plurality of logical shards to one of the plurality of physical shards;
identifying, by the server of the cloud-based storage system, a last key value for each logical shard in the partitioned ordered list;
saving, by the server of the cloud-based storage system, the identified last key value for each logical shard in a meta-store for the physical shard to which the logical shard is mapped; and
partially re-indexing, by the server of the cloud-based storage system, the plurality of shards of the one or more clusters based on addition of a node to the plurality of node in the one or more clusters, wherein partially re-indexing the plurality of shards comprises selecting a set of largest shards in the one or more clusters based on an index size of each shard, redirecting traffic for each shard of the selected set of shards to a different cluster of the one or more clusters while the plurality of shards are being partially reindexed, and re-directing traffic back to each shard of the selected set of shards upon completion of partially reindexing the plurality of shards.
1. (Currently Amended) A method for distributing data across a plurality of storage shards, the method comprising:
generating, by a server of a cloud-based storage system, a file key for each file of a plurality of files stored in a plurality of physical shards, each physical shard maintained by a node of a plurality of nodes in one or more clusters, wherein the file key comprises a hash of an enterprise identifier (enterprise ID) for an entity to which the creator of the file is a member, a hash of a folder identifier (folder ID) for a location in which the file is stored, and a hash of a file identifier (file ID) uniquely identifying the file;
sorting, by the server of the cloud-based storage system, the generated file keys for each file of the plurality of files into an ordered list;
logically partitioning, by the server of the cloud-based storage system, the ordered list into a plurality of logical shards;
mapping, by the server of the cloud-based storage system, each logical shard of the plurality of logical shards to one of the plurality of physical shards;
identifying, by the server of the cloud-based storage system, a last key value for each logical shard in the partitioned ordered list;
saving, by the server of the cloud-based storage system, the identified last key value for each logical shard in a meta-store for the physical shard to which the logical shard is mapped; and
partially re-indexing, by the server of the cloud-based storage system, the plurality of shards of the one or more clusters based on addition of a node to the plurality of node in the one or more clusters, wherein partially re-indexing the plurality of shards comprises selecting a set of largest shards in the one or more clusters based on an index size of each shard, redirecting traffic for each shard of the selected set of shards to a different cluster of the one or more clusters while the plurality of shards are being partially reindexed, and re-directing traffic back to each shard of the selected set of shards upon completion of partially reindexing the plurality of shards.
Claim 1 is rejected on the ground of non-statutory double patenting as being unpatentable over claim 1 of U.S Patent No. 11,599,500 (hereinafter as “Roy 500”). Although the claims at issue are not identical, they are not patentably distinct from each other.
Claim 1 of the instant application is obvious variation of claim of Roy 500 because claim 1 of the instant application is broader than claim of Roy 500, and the amended limitation in Roy 500 does not teach away from the scope of the claimed invention in claim 1 of the instant application. For example, U.S Patent 18/164,105 (hereinafter as “Roy 105”) recites almost identical claims with Roy 500 except the bolded portion from Roy 500 that recites, “generating, by a server of a cloud-based storage system, a file key for each file of a plurality of files stored in a plurality of physical shards, each physical shard maintained by a node of a plurality of nodes in one or more clusters, wherein the file key comprises a hash of an enterprise identifier (enterprise ID) for an entity to which the creator of the file is a member, a hash of a folder identifier (folder ID) for a location in which the file is stored, and a hash of a file identifier (file ID) uniquely identifying the file”.
The claimed difference would be obvious to the person of ordinary skill in the art, because the instant claims are merely broader and/or alternate variations of the claim recited in the patent application. Because the instant claims merely add/modify the additional elements from the set of elements and functions claimed in the parent application, such modification would be readily apparent to a person of the ordinary skill.
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention was made to omit/add/modify the additional element of claim 1 of Roy 500 to arrive at claim 1 of the instant application because the person would have realized that the remaining element would perform the same function as before. Therefore, it would have been obvious to modify the instant claims to adjust to include a alternative variation of “generating, by a server of a cloud-based storage system, a file key for each file of a plurality of files stored in a plurality of physical shards, each physical shard maintained by a node of a plurality of nodes in one or more clusters” to come to the same decision in performing the same set of function and steps as previously presented.
Instant Application:18/164,105(hereinafter as “Roy 105”)
Claim 2
U.S Patent No. 11,599,500 (hereinafter as “Roy 500”)
Claim 2
2. The method of claim 1, further comprising indexing a new file in the plurality of physical shards, wherein the indexing comprises: receiving, by the server of the cloud-based storage system, the new file; generating, by the server of the cloud-based storage system, a file key for the new file; identifying, by the server of the cloud-based storage system, a target shard for the new file based on the generated file key for the new file and the saved identified last key value for each logical shard saved in the meta-store for each physical shard; and routing, by the server of the cloud-based storage system, the new file to the identified target shard.
2. (Original) The method of claim 1, further comprising indexing a new file in the plurality of physical shards, wherein the indexing comprises: receiving, by the server of the cloud-based storage system, the new file; generating, by the server of the cloud-based storage system, a file key for the new file; identifying, by the server of the cloud-based storage system, a target shard for the new file based on the generated file key for the new file and the saved identified last key value for each logical shard saved in the meta-store for each physical shard; and routing, by the server of the cloud-based storage system, the new file to the identified target shard.
Claim 2 is dependent on claim 1 recite similar limitations to claim 2 of Roy 500, therefore, claim 2 is rejected for similar reasons as recited above.
Instant Application:18/164,105(hereinafter as “Roy 105”)
Claim 3
U.S Patent No. 11,599,500 (hereinafter as “Roy 500”)
Claim 3
3. The method of claim 1, further comprising processing a query from a user, wherein processing the query comprises: receiving, by the server of the cloud-based storage system, an enterprise ID associated with the user; identifying, by the server of the cloud-based storage system, one or more shards based on the received enterprise ID associated with the user and the saved identified last key value for each logical shard saved in the meta-store for each physical shard; and routing, by the server of the cloud-based storage system, the query to the identified one or more shards.
3. (Original) The method of claim 1, further comprising processing a query from a user, wherein processing the query comprises: receiving, by the server of the cloud-based storage system, an enterprise ID associated with the user; identifying, by the server of the cloud-based storage system, one or more shards based on the received enterprise ID associated with the user and the saved identified last key value for each logical shard saved in the meta-store for each physical shard; and routing, by the server of the cloud-based storage system, the query to the identified one or more shards.
Claim 3 is dependent on claim 1 recite similar limitations to claim 3 of Silk 500, therefore, claim 3 is rejected for similar reasons as recited above.
Instant Application:18/164,105 (hereinafter as “Roy 105”)
Claim 4
U.S Patent No. 11,599,500 (hereinafter as “Roy 500”)
Claim 4
4. The method of claim 1, wherein mapping each logical shard of the plurality of logical shards to one of the plurality of physical shards further comprises: sorting, by the server of the cloud-based storage system, the plurality of logical shards in descending order based on a load of each logical shard; assigning, by the server of the cloud-based storage system, one logical shard beginning from a top of the sorted ordered list to each physical shard of the plurality of physical shards; sorting, by the server of the cloud-based storage system, remaining unassigned logical shards in ascending order based on the load of each remaining unassigned logical shard; assigning, by the server of the cloud-based storage system, one logical shard beginning from the top of the sorted ordered list to each physical shard of the plurality of physical shards; determining, by the server of the cloud-based storage system, whether all logical shards have been assigned to a physical shard; and in response to determining not all logical shards have been assigned to a physical shard, repeating, by the server of the cloud-based storage system, said sorting of the plurality of logical shards in descending order based on a load of each logical shard, assigning one logical shard beginning from a top of the sorted ordered list to each physical shard of the plurality of physical shards, sorting of remaining unassigned logical shards in ascending order based on the load of each remaining unassigned logical shard, and assigning one logical shard beginning from the top of the sorted ordered list to each physical shard of the plurality of physical shards until all logical shards have been assigned to a physical shard.
4. (Original) The method of claim 1, wherein mapping each logical shard of the plurality of logical shards to one of the plurality of physical shards further comprises: sorting, by the server of the cloud-based storage system, the plurality of logical shards in descending order based on a load of each logical shard; assigning, by the server of the cloud-based storage system, one logical shard beginning from a top of the sorted ordered list to each physical shard of the plurality of physical shards; sorting, by the server of the cloud-based storage system, remaining unassigned logical shards in ascending order based on the load of each remaining unassigned logical shard; assigning, by the server of the cloud-based storage system, one logical shard beginning from the top of the sorted ordered list to each physical shard of the plurality of physical shards; determining, by the server of the cloud-based storage system, whether all logical shards have been assigned to a physical shard; and in response to determining not all logical shards have been assigned to a physical shard, repeating, by the server of the cloud-based storage system, said sorting of the plurality of logical shards in descending order based on a load of each logical shard, assigning one logical shard beginning from a top of the sorted ordered list to each physical shard of the plurality of physical shards, sorting of remaining unassigned logical shards in ascending order based on the load of each remaining unassigned logical shard, and assigning one logical shard beginning from the top of the sorted ordered list to each physical shard of the plurality of physical shards until all logical shards have been assigned to a physical shard.
Claim 4 is dependent on claim 1 recite similar limitations to claim 4 of Roy 500, therefore, claim 4 is rejected for similar reasons as recited above.
Instant Application:18/164,105 (hereinafter as “Roy 105”)
Claim 5
U.S Patent No. 11,599,500 (hereinafter as “Roy 500”)
Claim 5
5. The method of claim 1, further comprising spilling over a source shard of the plurality of physical shards to a target shard of the plurality of target shards, wherein spilling over the source shard to the target shard comprises:
identifying, by the server of the cloud-based storage system, the source shard based on available storage capacity of each physical shard of the plurality of physical shards; identifying, by the server of the cloud-based storage system, the target shard based on the available storage capacity of each physical shard of the plurality of physical shards; spilling, by the server of the cloud-based storage system, content from the source shard to the target shard; updating, by the server of the cloud-based storage system, an override map of the meta- store for the source shard and the target shard; and federating, by the server of the cloud-based storage system, the update to the override map of the meta-store for the source shard and the target shard to one or more components of the cloud- based storage system.
5. (Original) The method of claim 1, further comprising spilling over a source shard of the plurality of physical shards to a target shard of the plurality of target shards, wherein spilling over the source shard to the target shard comprises:
identifying, by the server of the cloud-based storage system, the source shard based on available storage capacity of each physical shard of the plurality of physical shards; identifying, by the server of the cloud-based storage system, the target shard based on the available storage capacity of each physical shard of the plurality of physical shards; spilling, by the server of the cloud-based storage system, content from the source shard to the target shard; updating, by the server of the cloud-based storage system, an override map of the meta- store for the source shard and the target shard; and federating, by the server of the cloud-based storage system, the update to the override map of the meta-store for the source shard and the target shard to one or more components of the cloud- based storage system.
Claim 5 is dependent on claim 1 recite similar limitations to claim 5 of Roy 500, therefore, claim 5 is rejected for similar reasons as recited above.
Instant Application:18/164,105 (hereinafter as “Roy 105”)
Claim 6
U.S Patent No. 11,599,500 (hereinafter as “Roy 500”)
Claim 6
6. The method of claim 1, further comprising spilling over a source shard of the plurality of physical shards to a target shard of the plurality of target shards, wherein spilling over the source shard to the target shard comprises:
identifying, by the server of the cloud-based storage system, available space for each node of the plurality of node in a cluster of the one or more clusters; selecting, by the server of the cloud-based storage system, a plurality of target shards based on an overhead capacity of each shard of the plurality of shards on a node of the plurality of node having a most available space; identifying, by the server of the cloud-based storage system, one or more source shards based on one or more overhead criteria; selecting, by the server of the cloud-based storage system, one of the identified one or more source shards; spilling, by the server of the cloud-based storage system, the selected one of the identified one or more source shards to the selected plurality of target shards; updating, by the server of the cloud-based storage system, an override map of the meta- store for the source shard and each target shard; federating, by the server of the cloud-based storage system, the update to the override map of the meta-store for the source shard and each target shard to one or more components of the cloud-based storage system; determining, by the server of the cloud-based storage system, whether all of the identified one or more source shards have been spilled to the plurality of target shards; and in response to determining not all of the identified one or more source shards have been spilled to the plurality of target shards, repeating, by the server of the cloud-based storage system, the selecting one of the identified one or more source shards, spilling the selected one of the identified one or more source shards to the selected plurality of target shards, updating the override map of the meta-store for the source shard and each target shard, and federating the update to the override map of the meta-store for the source shard and each target shard to one or more components of the cloud-based storage system until all of the identified one or more source shards have been spilled to the plurality of target shards.
6. (Original) The method of claim 1, further comprising spilling over a source shard of the plurality of physical shards to a target shard of the plurality of target shards, wherein spilling over the source shard to the target shard comprises:
identifying, by the server of the cloud-based storage system, available space for each node of the plurality of node in a cluster of the one or more clusters; selecting, by the server of the cloud-based storage system, a plurality of target shards based on an overhead capacity of each shard of the plurality of shards on a node of the plurality of node having a most available space; identifying, by the server of the cloud-based storage system, one or more source shards based on one or more overhead criteria; selecting, by the server of the cloud-based storage system, one of the identified one or more source shards; spilling, by the server of the cloud-based storage system, the selected one of the identified one or more source shards to the selected plurality of target shards; updating, by the server of the cloud-based storage system, an override map of the meta- store for the source shard and each target shard; federating, by the server of the cloud-based storage system, the update to the override map of the meta-store for the source shard and each target shard to one or more components of the cloud-based storage system; determining, by the server of the cloud-based storage system, whether all of the identified one or more source shards have been spilled to the plurality of target shards; and in response to determining not all of the identified one or more source shards have been spilled to the plurality of target shards, repeating, by the server of the cloud-based storage system, the selecting one of the identified one or more source shards, spilling the selected one of the identified one or more source shards to the selected plurality of target shards, updating the override map of the meta-store for the source shard and each target shard, and federating the update to the override map of the meta-store for the source shard and each target shard to one or more components of the cloud-based storage system until all of the identified one or more source shards have been spilled to the plurality of target shards.
Claim 6 is dependent on claim 1 recite similar limitations to claim 6 of Roy 500 therefore, claim 6 is rejected for similar reasons as recited above.
Instant Application:18/164,105 (hereinafter as “Roy 105”)
Claim 7
U.S Patent No. 11,599,500 (hereinafter as “Roy 500”)
Claim 7
7. The method of claim 1, wherein partially re-indexing the plurality of shards further comprises:
splitting, by the server of the cloud-based storage system, a key range for each shard of the selected set of largest shards in half;
generating, by the server of the cloud-based storage system, a new shard table for the cluster containing the selected set of largest shards; indexing, by the server of the cloud-based storage system, a first half of the key range for each shard of the selected set of largest shards on a source node for the shard;
indexing, by the server of the cloud-based storage system, a second half of the key range for each shard of the selected set of largest shards on a node different from the source node for the shard; and
updating, by the server of the cloud-based storage system, cluster configuration information and shard key ranges in the meta-store of each node based on the partial re-indexing.
7. (Currently Amended) The method of claim 1, wherein partially re-indexing the plurality of shards further comprises:
splitting, by the server of the cloud-based storage system, a key range for each shard of the selected set of largest shards in half;
generating, by the server of the cloud-based storage system, a new shard table for the cluster containing the selected set of largest shards; indexing, by the server of the cloud-based storage system, a first half of the key range for each shard of the selected set of largest shards on a source node for the shard;
indexing, by the server of the cloud-based storage system, a second half of the key range for each shard of the selected set of largest shards on a node different from the source node for the shard; and
updating, by the server of the cloud-based storage system, cluster configuration information and shard key ranges in the meta-store of each node based on the partial re-indexing; and
Claim 7 is dependent on claim 1 recite similar limitations to claim 7 of Roy 500 therefore, claim 7 is rejected for similar reasons as recited above.
Instant Application:18/164,105 (hereinafter as “Roy 105”)
Claim 8
U.S Patent No. 11,599,500 (hereinafter as “Roy 500”)
Claim 8
8. A server of a cloud-based storage system, the server comprising:
a processor; and a memory coupled with and readable by the processor and storing therein a set of instructions which, when executed by the processor, causes the processor to distribute data across a plurality of storage shards by:
generating a file key for each file of a plurality of files stored in a plurality of physical shards, each physical shard maintained by a node of a plurality of nodes in one or more clusters;
sorting the generated file keys for each file of the plurality of files into an ordered list; logically partitioning the ordered list into a plurality of logical shards;
mapping each logical shard of the plurality of logical shards to one of the plurality of physical shards;
identifying a last key value for each logical shard in the partitioned ordered list; saving the identified last key value for each logical shard in a meta-store for the physical shard to which the logical shard is mapped; and
partially re-indexing the plurality of shards of the one or more clusters based on addition of a node to the plurality of node in the one or more clusters, wherein partially re-indexing the plurality of shards comprises selecting a set of largest shards in the one or more clusters based on an index size of each shard, redirecting traffic for each shard of the selected set of shards to a different cluster of the one or more clusters while the plurality of shards are being partially reindexed, and re-directing traffic back to each shard of the selected set of shards upon completion of partially reindexing the plurality of shards.
8. (Currently Amended) A server of a cloud-based storage system, the server comprising:
a processor; and a memory coupled with and readable by the processor and storing therein a set of instructions which, when executed by the processor, causes the processor to distribute data across a plurality of storage shards by:
generating a file key for each file of a plurality of files stored in a plurality of physical shards, each physical shard maintained by a node of a plurality of nodes in one or more clusters, wherein the file key comprises a hash of an enterprise identifier (enterprise ID) for an entity to which the creator of the file is a member, a hash of a folder identifier (folder ID) for a location in which the file is stored, and a hash of a file identifier (file ID) uniquely identifying the file;
sorting the generated file keys for each file of the plurality of files into an ordered list; logically partitioning the ordered list into a plurality of logical shards;
mapping each logical shard of the plurality of logical shards to one of the plurality of physical shards;
identifying a last key value for each logical shard in the partitioned ordered list; saving the identified last key value for each logical shard in a meta-store for the physical shard to which the logical shard is mapped; and
partially re-indexing the plurality of shards of the one or more clusters based on addition of a node to the plurality of node in the one or more clusters, wherein partially re-indexing the plurality of shards comprises selecting a set of largest shards in the one or more clusters based on an index size of each shard, redirecting traffic for each shard of the selected set of shards to a different cluster of the one or more clusters while the plurality of shards are being partially reindexed, and re-directing traffic back to each shard of the selected set of shards upon completion of partially reindexing the plurality of shards.
Claim 8 is rejected on the ground of non-statutory double patenting as being unpatentable over claim 8 of U.S Patent No. 11,599,500 (hereinafter as “Roy 500”). Although the claims at issue are not identical, they are not patentably distinct from each other.
Claim 8 of the instant application is obvious variation of claim of Roy 500 because claim 8 of the instant application is broader than claim of Roy 500, and the amended limitation in Roy 500 does not teach away from the scope of the claimed invention in claim 8 of the instant application. For example, U.S Patent 18/164,105 (hereinafter as “Roy 105”) recites almost identical claims with Roy 500 except the bolded portion from Roy 500 that recites, “generating, by a server of a cloud-based storage system, a file key for each file of a plurality of files stored in a plurality of physical shards, each physical shard maintained by a node of a plurality of nodes in one or more clusters, wherein the file key comprises a hash of an enterprise identifier (enterprise ID) for an entity to which the creator of the file is a member, a hash of a folder identifier (folder ID) for a location in which the file is stored, and a hash of a file identifier (file ID) uniquely identifying the file”.
The claimed difference would be obvious to the person of ordinary skill in the art, because the instant claims are merely broader and/or alternate variations of the claim recited in the patent application. Because the instant claims merely add/modify the additional elements from the set of elements and functions claimed in the parent application, such modification would be readily apparent to a person of the ordinary skill.
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention was made to omit/add/modify the additional element of claim 8 of Roy 500 to arrive at claim 8 of the instant application because the person would have realized that the remaining element would perform the same function as before. Therefore, it would have been obvious to modify the instant claims to adjust to include a alternative variation of “generating, by a server of a cloud-based storage system, a file key for each file of a plurality of files stored in a plurality of physical shards, each physical shard maintained by a node of a plurality of nodes in one or more clusters” to come to the same decision in performing the same set of function and steps as previously presented.
Instant Application:18/164,105 (hereinafter as “Roy 105”)
Claim 9
U.S Patent No. 11,599,500 (hereinafter as “Roy 500”)
Claim 9
9. The server of claim 8, wherein the instructions further cause the processor to index a new file in the plurality of physical shards by: receiving the new file; generating a file key for the new file; identifying a target shard for the new file based on the generated file key for the new file and the saved identified last key value for each logical shard saved in the meta-store for each physical shard; and routing the new file to the identified target shard.
9. (Original) The server of claim 8, wherein the instructions further cause the processor to index a new file in the plurality of physical shards by: receiving the new file; generating a file key for the new file; identifying a target shard for the new file based on the generated file key for the new file and the saved identified last key value for each logical shard saved in the meta-store for each physical shard; and routing the new file to the identified target shard.
Claim 9 is dependent on claim 8 recite similar limitations to claim 9 of Roy 500, therefore, claim 9 is rejected for similar reasons as recited above.
Instant Application:18/164,105 (hereinafter as “Roy 105”)
Claim 10
U.S Patent No. 11,599,500 (hereinafter as “Roy 500”)
Claim 16
10. The server of claim 8, wherein the instructions further cause the processor to process a query from a user by: receiving an enterprise ID associated with the user; identifying one or more shards based on the received enterprise ID associated with the user and the saved identified last key value for each logical shard saved in the meta-store for each physical shard; and routing the query to the identified one or more shards.
10. (Original) The server of claim 8, wherein the instructions further cause the processor to process a query from a user by: receiving an enterprise ID associated with the user; identifying one or more shards based on the received enterprise ID associated with the user and the saved identified last key value for each logical shard saved in the meta-store for each physical shard; and routing the query to the identified one or more shards.
Claim 10 is dependent on claim 8 recite similar limitations to claim 10 of Roy 500, therefore, claim 10 is rejected for similar reasons as recited above.
Instant Application:18/164,105 (hereinafter as “Roy 105”)
Claim 11
U.S Patent No. 11,599,500 (hereinafter as “Roy 500”)
Claim 11
11. The server of claim 8, wherein mapping each logical shard of the plurality of logical shards to one of the plurality of physical shards further comprises:
sorting the plurality of logical shards in descending order based on a load of each logical shard; assigning one logical shard beginning from a top of the sorted ordered list to each physical shard of the plurality of physical shards; sorting remaining unassigned logical shards in ascending order based on the load of each remaining unassigned logical shard; assigning one logical shard beginning from the top of the sorted ordered list to each physical shard of the plurality of physical shards; determining whether all logical shards have been assigned to a physical shard; and in response to determining not all logical shards have been assigned to a physical shard, repeating said sorting of the plurality of logical shards in descending order based on a load of each logical shard, assigning one logical shard beginning from a top of the sorted ordered list to each physical shard of the plurality of physical shards, sorting of remaining unassigned logical shards in ascending order based on the load of each remaining unassigned logical shard, and assigning one logical shard beginning from the top of the sorted ordered list to each physical shard of the plurality of physical shards until all logical shards have been assigned to a physical shard.
11. (Original) The server of claim 8, wherein mapping each logical shard of the plurality of logical shards to one of the plurality of physical shards further comprises:
sorting the plurality of logical shards in descending order based on a load of each logical shard; assigning one logical shard beginning from a top of the sorted ordered list to each physical shard of the plurality of physical shards; sorting remaining unassigned logical shards in ascending order based on the load of each remaining unassigned logical shard; assigning one logical shard beginning from the top of the sorted ordered list to each physical shard of the plurality of physical shards; determining whether all logical shards have been assigned to a physical shard; and in response to determining not all logical shards have been assigned to a physical shard, repeating said sorting of the plurality of logical shards in descending order based on a load of each logical shard, assigning one logical shard beginning from a top of the sorted ordered list to each physical shard of the plurality of physical shards, sorting of remaining unassigned logical shards in ascending order based on the load of each remaining unassigned logical shard, and assigning one logical shard beginning from the top of the sorted ordered list to each physical shard of the plurality of physical shards until all logical shards have been assigned to a physical shard.
Claim 11 is dependent on independent claim 8 recite similar limitations to claim 11 of Roy 500, therefore, claim 11 is rejected for similar reasons as recited above.
Instant Application:18/164,105 (hereinafter as “Roy 105”)
Claim 12
U.S Patent No. 11,599,500 (hereinafter as “Roy 500”)
Claim 12
12. The server of claim 8, wherein the instructions further cause the processor to spill over a source shard of the plurality of physical shards to a target shard of the plurality of target shards by:
identifying the source shard based on available storage capacity of each physical shard of the plurality of physical shards; identifying the target shard based on the available storage capacity of each physical shard of the plurality of physical shards; spilling content from the source shard to the target shard; updating an override map of the meta-store for the source shard and the target shard; and federating the update to the override map of the meta-store for the source shard and the target shard to one or more components of the cloud-based storage system.
12. (Original) The server of claim 8, wherein the instructions further cause the processor to spill over a source shard of the plurality of physical shards to a target shard of the plurality of target shards by:
identifying the source shard based on available storage capacity of each physical shard of the plurality of physical shards; identifying the target shard based on the available storage capacity of each physical shard of the plurality of physical shards; spilling content from the source shard to the target shard; updating an override map of the meta-store for the source shard and the target shard; and federating the update to the override map of the meta-store for the source shard and the target shard to one or more components of the cloud-based storage system.
Claim 12 is dependent on independent claim 8 recite similar limitations to claim 12 of Roy 500, therefore, claim 12 is rejected for similar reasons as recited above.
Instant Application:18/164,105 (hereinafter as “Roy 105”)
Claim 13
U.S Patent No. 11,599,500 (hereinafter as “Roy 500”)
Claim 13
13. The server of claim 8, wherein the instructions further cause the processor to spill over a source shard of the plurality of physical shards to a target shard of the plurality of target shards by:
identifying available space for each node of the plurality of node in a cluster of the one or more clusters;
selecting a plurality of target shards based on an overhead capacity of each shard of the plurality of shards on a node of the plurality of node having a most available space;
identifying one or more source shards based on one or more overhead criteria; selecting one of the identified one or more source shards;
spilling the selected one of the identified one or more source shards to the selected plurality of target shards;
updating an override map of the meta-store for the source shard and each target shard;
federating the update to the override map of the meta-store for the source shard and each target shard to one or more components of the cloud-based storage system;
determining whether all of the identified one or more source shards have been spilled to the plurality of target shards; and
in response to determining not all of the identified one or more source shards have been spilled to the plurality of target shards, repeating the selecting one of the identified one or more source shards, spilling the selected one of the identified one or more source shards to the selected plurality of target shards, updating the override map of the meta-store for the source shard and each target shard, and federating the update to the override map of the meta-store for the source shard and each target shard to one or more components of the cloud-based storage system until all of the identified one or more source shards have been spilled to the plurality of target shards.
13. (Original) The server of claim 8, wherein the instructions further cause the processor to spill over a source shard of the plurality of physical shards to a target shard of the plurality of target shards by:
identifying available space for each node of the plurality of node in a cluster of the one or more clusters;
selecting a plurality of target shards based on an overhead capacity of each shard of the plurality of shards on a node of the plurality of node having a most available space;
identifying one or more source shards based on one or more overhead criteria; selecting one of the identified one or more source shards;
spilling the selected one of the identified one or more source shards to the selected plurality of target shards;
updating an override map of the meta-store for the source shard and each target shard;
federating the update to the override map of the meta-store for the source shard and each target shard to one or more components of the cloud-based storage system;
determining whether all of the identified one or more source shards have been spilled to the plurality of target shards; and
in response to determining not all of the identified one or more source shards have been spilled to the plurality of target shards, repeating the selecting one of the identified one or more source shards, spilling the selected one of the identified one or more source shards to the selected plurality of target shards, updating the override map of the meta-store for the source shard and each target shard, and federating the update to the override map of the meta-store for the source shard and each target shard to one or more components of the cloud-based storage system until all of the identified one or more source shards have been spilled to the plurality of target shards.
Claim 13 dependent on dependent on independent claim 8 recite similar limitations to claims 13 of Roy 500, therefore, claim 13 is rejected for similar reasons as recited above.
Instant Application:18/164,105 (hereinafter as “Roy 105”)
Claim 14
U.S Patent No. 11,599,500 (hereinafter as “Roy 500”)
Claim 14
14. The server of claim 8, wherein partially re-indexing the plurality of shards of the one or more clusters further comprises:
splitting a key range for each shard of the selected set of largest shards in half, generating a new shard table for the cluster containing the selected set of largest shards; indexing a first half of the key range for each shard of the selected set of largest shards on a source node for the shard; indexing a second half of the key range for each shard of the selected set of largest shards on a node different from the source node for the shard; and
updating cluster configuration information and shard key ranges in the meta-store of each node based on the partial re-indexing.
14. (Currently Amended) The server of claim 8, wherein partially re-indexing the plurality of shards of the one or more clusters further comprises:
splitting a key range for each shard of the selected set of largest shards in half, generating a new shard table for the cluster containing the selected set of largest shards; indexing a first half of the key range for each shard of the selected set of largest shards on a source node for the shard; indexing a second half of the key range for each shard of the selected set of largest shards on a node different from the source node for the shard; and
updating cluster configuration information and shard key ranges in the meta-store of each node based on the partial re-indexing;
Claim 14 dependent on dependent on independent claim 8 recite similar limitations to claim 14 of Roy 500, therefore, claim 14 is rejected for similar reasons as recited above.
Instant Application:18/164,105 (hereinafter as “Roy 105”)
Claim 15
U.S Patent No. 11,599,500 (hereinafter as “Roy 500”)
Claim 15
15. A non-transitory, computer-readable medium comprising a set of instructions stored therein which, when executed by a processor, causes the processor to distribute data across a plurality of storage shards by:
generating a file key for each file of a plurality of files stored in a plurality of physical shards, each physical shard maintained by a node of a plurality of nodes in one or more clusters;
sorting the generated file keys for each file of the plurality of files into an ordered list; logically partitioning the ordered list into a plurality of logical shards; mapping each logical shard of the plurality of logical shards to one of the plurality of physical shards; identifying a last key value for each logical shard in the partitioned ordered list; saving the identified last key value for each logical shard in a meta-store for the physical shard to which the logical shard is mapped; and partially re-indexing the plurality of shards of the one or more clusters based on addition of a node to the plurality of node in the one or more clusters, wherein partially re-indexing the plurality of shards comprises selecting a set of largest shards in the one or more clusters based on an index size of each shard, redirecting traffic for each shard of the selected set of shards to a different cluster of the one or more clusters while the plurality of shards are being partially reindexed, and re-directing traffic back to each shard of the selected set of shards upon completion of partially reindexing the plurality of shards.
15. (Currently Amended) A non-transitory, computer-readable medium comprising a set of instructions stored therein which, when executed by a processor, causes the processor to distribute data across a plurality of storage shards by:
generating a file key for each file of a plurality of files stored in a plurality of physical shards, each physical shard maintained by a node of a plurality of nodes in one or more clusters, wherein the file key comprises a hash of an enterprise identifier (enterprise ID) for an entity to which the creator of the file is a member, a hash of a folder identifier (folder ID) for a location in which the file is stored, and a hash of a file identifier (file ID) uniquely identifying the file;
sorting the generated file keys for each file of the plurality of files into an ordered list; logically partitioning the ordered list into a plurality of logical shards; mapping each logical shard of the plurality of logical shards to one of the plurality of physical shards; identifying a last key value for each logical shard in the partitioned ordered list; saving the identified last key value for each logical shard in a meta-store for the physical shard to which the logical shard is mapped; and partially re-indexing the plurality of shards of the one or more clusters based on addition of a node to the plurality of node in the one or more clusters, wherein partially re-indexing the plurality of shards comprises selecting a set of largest shards in the one or more clusters based on an index size of each shard, redirecting traffic for each shard of the selected set of shards to a different cluster of the one or more clusters while the plurality of shards are being partially reindexed, and re-directing traffic back to each shard of the selected set of shards upon completion of partially reindexing the plurality of shards.
Claim 15 is rejected on the ground of non-statutory double patenting as being unpatentable over claim 15 of U.S Patent No. 11,599,500 (hereinafter as “Roy 500”). Although the claims at issue are not identical, they are not patentably distinct from each other.
Claim 15 of the instant application is obvious variation of claim of Roy 500 because claim 15 of the instant application is broader than claim of Roy 500, and the amended limitation in Roy 500 does not teach away from the scope of the claimed invention in claim 15 of the instant application. For example, U.S Patent 18/164,105 (hereinafter as “Roy 105”) recites almost identical claims with Roy 500 except the bolded portion from Roy 500 that recites, “generating, by a server of a cloud-based storage system, a file key for each file of a plurality of files stored in a plurality of physical shards, each physical shard maintained by a node of a plurality of nodes in one or more clusters, wherein the file key comprises a hash of an enterprise identifier (enterprise ID) for an entity to which the creator of the file is a member, a hash of a folder identifier (folder ID) for a location in which the file is stored, and a hash of a file identifier (file ID) uniquely identifying the file”.
The claimed difference would be obvious to the person of ordinary skill in the art, because the instant claims are merely broader and/or alternate variations of the claim recited in the patent application. Because the instant claims merely add/modify the additional elements from the set of elements and functions claimed in the parent application, such modification would be readily apparent to a person of the ordinary skill.
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention was made to omit/add/modify the additional element of claim 15 of Roy 500 to arrive at claim 15 of the instant application because the person would have realized that the remaining element would perform the same function as before. Therefore, it would have been obvious to modify the instant claims to adjust to include a alternative variation of “generating, by a server of a cloud-based storage system, a file key for each file of a plurality of files stored in a plurality of physical shards, each physical shard maintained by a node of a plurality of nodes in one or more clusters” to come to the same decision in performing the same set of function and steps as previously presented.
Instant Application:18/164,105 (hereinafter as “Roy 105”)
Claim 16
U.S Patent No. 11,599,500 (hereinafter as “Roy 500”)
Claim 16
16. The non-transitory, computer-readable medium of claim 15, wherein the instructions further cause the processor to index a new file in the plurality of physical shards by:
receiving the new file; generating a file key for the new file; identifying a target shard for the new file based on the generated file key for the new file and the saved identified last key value for each logical shard saved in the meta-store for each physical shard; and routing the new file to the identified target shard.
16. (Original) The non-transitory, computer-readable medium of claim 15, wherein the instructions further cause the processor to index a new file in the plurality of physical shards by:
receiving the new file; generating a file key for the new file; identifying a target shard for the new file based on the generated file key for the new file and the saved identified last key value for each logical shard saved in the meta-store for each physical shard; and routing the new file to the identified target shard.
Claim 16 dependent on independent claim 15 recite similar limitations to claims 16 of Roy 500, therefore, claim 16 is rejected for similar reasons as recited above.
Instant Application:18/164,105 (hereinafter as “Roy 105”)
Claim 17
U.S Patent No. 11,599,500 (hereinafter as “Roy 500”)
Claim 17
17. The non-transitory, computer-readable medium of claim 15, wherein the instructions further cause the processor to process a query from a user by:
receiving an enterprise ID associated with the user; identifying one or more shards based on the received enterprise ID associated with the user and the saved identified last key value for each logical shard saved in the meta-store for each physical shard; and routing the query to the identified one or more shards.
17. (Original) The non-transitory, computer-readable medium of claim 15, wherein the instructions further cause the processor to process a query from a user by:
receiving an enterprise ID associated with the user; identifying one or more shards based on the received enterprise ID associated with the user and the saved identified last key value for each logical shard saved in the meta-store for each physical shard; and routing the query to the identified one or more shards.
Claim 17 dependent on independent claim 15 recite similar limitations to claim 17 of Roy 500, therefore, claim 17 is rejected for similar reasons as recited above.
Instant Application:18/164,105 (hereinafter as “Roy 105”)
Claim 18
U.S Patent No. 11,599,500 (hereinafter as “Roy 500”)
Claim 18
18. The non-transitory, computer-readable medium of claim 15, wherein mapping each logical shard of the plurality of logical shards to one of the plurality of physical shards further comprises:
sorting the plurality of logical shards in descending order based on a load of each logical shard; assigning one logical shard beginning from a top of the sorted ordered list to each physical shard of the plurality of physical shards; sorting remaining unassigned logical shards in ascending order based on the load of each remaining unassigned logical shard; assigning one logical shard beginning from the top of the sorted ordered list to each physical shard of the plurality of physical shards; determining whether all logical shards have been assigned to a physical shard; and in response to determining not all logical shards have been assigned to a physical shard, repeating said sorting of the plurality of logical shards in descending order based on a load of each logical shard, assigning one logical shard beginning from a top of the sorted ordered list to each physical shard of the plurality of physical shards, sorting of remaining unassigned logical shards in ascending order based on the load of each remaining unassigned logical shard, and assigning one logical shard beginning from the top of the sorted ordered list to each physical shard of the plurality of physical shards until all logical shards have been assigned to a physical shard.
18. (Original) The non-transitory, computer-readable medium of claim 15, wherein mapping each logical shard of the plurality of logical shards to one of the plurality of physical shards further comprises:
sorting the plurality of logical shards in descending order based on a load of each logical shard; assigning one logical shard beginning from a top of the sorted ordered list to each physical shard of the plurality of physical shards; sorting remaining unassigned logical shards in ascending order based on the load of each remaining unassigned logical shard; assigning one logical shard beginning from the top of the sorted ordered list to each physical shard of the plurality of physical shards; determining whether all logical shards have been assigned to a physical shard; and in response to determining not all logical shards have been assigned to a physical shard, repeating said sorting of the plurality of logical shards in descending order based on a load of each logical shard, assigning one logical shard beginning from a top of the sorted ordered list to each physical shard of the plurality of physical shards, sorting of remaining unassigned logical shards in ascending order based on the load of each remaining unassigned logical shard, and assigning one logical shard beginning from the top of the sorted ordered list to each physical shard of the plurality of physical shards until all logical shards have been assigned to a physical shard.
Claim 18 dependent on independent claim 15 recite similar limitations to claims 18 of Roy 500, therefore, claim 18 is rejected for similar reasons as recited above.
Instant Application:18/164,105 (hereinafter as “Roy 105”)
Claim 19
U.S Patent No. 11,599,500 (hereinafter as “Roy 500”)
Claim 19
19. The non-transitory, computer-readable medium of claim 15, wherein the instructions further cause the processor to spill over a source shard of the plurality of physical shards to a target shard of the plurality of target shards by:
identifying the source shard based on available storage capacity of each physical shard of the plurality of physical shards; identifying the target shard based on the available storage capacity of each physical shard of the plurality of physical shards; spilling content from the source shard to the target shard; updating an override map of the meta-store for the source shard and the target shard; and federating the update to the override map of the meta-store for the source shard and the target shard to one or more components of the cloud-based storage system.
19. (Original) The non-transitory, computer-readable medium of claim 15, wherein the instructions further cause the processor to spill over a source shard of the plurality of physical shards to a target shard of the plurality of target shards by:
identifying the source shard based on available storage capacity of each physical shard of the plurality of physical shards; identifying the target shard based on the available storage capacity of each physical shard of the plurality of physical shards; spilling content from the source shard to the target shard; updating an override map of the meta-store for the source shard and the target shard; and federating the update to the override map of the meta-store for the source shard and the target shard to one or more components of the cloud-based storage system.
Claim 19 dependent on independent claim 15 recite similar limitations to claim 19 of Roy 500, therefore, claim 19 is rejected for similar reasons as recited above.
Instant Application:18/164,105 (hereinafter as “Roy 105”)
Claim 20
U.S Patent No. 11,599,500 (hereinafter as “Roy 500”)
Claim 20
20. The non-transitory, computer-readable medium of claim 15, wherein the instructions further cause the processor to spill over a source shard of the plurality of physical shards to a target shard of the plurality of target shards by:
identifying available space for each node of the plurality of node in a cluster of the one or more clusters; selecting a plurality of target shards based on an overhead capacity of each shard of the plurality of shards on a node of the plurality of node having a most available space; identifying one or more source shards based on one or more overhead criteria; selecting one of the identified one or more source shards; spilling the selected one of the identified one or more source shards to the selected plurality of target shards; updating an override map of the meta-store for the source shard and each target shard; federating the update to the override map of the meta-store for the source shard and each target shard to one or more components of the cloud-based storage system; determining whether all of the identified one or more source shards have been spilled to the plurality of target shards; and in response to determining not all of the identified one or more source shards have been spilled to the plurality of target shards, repeating the selecting one of the identified one or more source shards, spilling the selected one of the identified one or more source shards to the selected plurality of target shards, updating the override map of the meta-store for the source shard and each target shard, and federating the update to the override map of the meta-store for the source shard and each target shard to one or more components of the cloud-based storage system until all of the identified one or more source shards have been spilled to the plurality of target shard.
20. (Original) The non-transitory, computer-readable medium of claim 15, wherein the instructions further cause the processor to spill over a source shard of the plurality of physical shards to a target shard of the plurality of target shards by:
identifying available space for each node of the plurality of node in a cluster of the one or more clusters; selecting a plurality of target shards based on an overhead capacity of each shard of the plurality of shards on a node of the plurality of node having a most available space; identifying one or more source shards based on one or more overhead criteria; selecting one of the identified one or more source shards; spilling the selected one of the identified one or more source shards to the selected plurality of target shards; updating an override map of the meta-store for the source shard and each target shard; federating the update to the override map of the meta-store for the source shard and each target shard to one or more components of the cloud-based storage system; determining whether all of the identified one or more source shards have been spilled to the plurality of target shards; and in response to determining not all of the identified one or more source shards have been spilled to the plurality of target shards, repeating the selecting one of the identified one or more source shards, spilling the selected one of the identified one or more source shards to the selected plurality of target shards, updating the override map of the meta-store for the source shard and each target shard, and federating the update to the override map of the meta-store for the source shard and each target shard to one or more components of the cloud-based storage system until all of the identified one or more source shards have been spilled to the plurality of target shard.
Claim 20 dependent on independent claim 15 recite similar limitations to claim 20 of Roy 500, therefore, claim 20 is rejected for similar reasons as recited above.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-3, 8-10, and 15-17 are rejected under 35 U.S.C. 103 as being unpatentable over U.S Patent Application Publication 2014/0149794 issued to Shetty et al. (hereinafter as "Shetty") in view of U.S Patent Application Publication 2015/0254325 issued to Russell R. Stringham (hereinafter as “Stringham”) in further view of U.S Patent Application Publication 2014/0330785 issued to Isherwood et al. (hereinafter as “Isherwood”).
Regarding claim 1, Shetty teaches a method for distributing data across a plurality of storage shards (Shetty: [0234]; the elements of cloud 102 can be distributed and replicated among a plurality of cloud computer systems 2200 as determined to be desirable), the method comprising: generating, by a server of a cloud-based storage system, a file key for each file of a plurality of files stored in a plurality of physical shards (Shetty: [0096]; Client ID field 404 is the key field for table 400A and uniquely identifies one of clients 112(1-b) or local cloud 104. Shard information field 406 provides shard information associated with a shard of tables 400B-400G, as will be described below. Shard information field 406 contains information sufficient to identify and access the particular shard of tables 400B-400G associated with the entity identified by client ID field 404. [0122]; First field 602 of UUID 600 includes a shard identifier (e.g., an alpha-numeric key, etc.) associated with one of shard records 530(1-k), which in turn identifies one of object-filer map shards 502(1-k)
PNG
media_image1.png
315
655
media_image1.png
Greyscale
),
each physical shard maintained by a node of a plurality of nodes in one or more clusters (Shetty: [0085]; Filers 222(1-n) are storage nodes for the digital objects stored in cloud 102. Each filer 222(1-n) is very generic and includes at least one instance of a storage node service that communicates with private network 302 and facilitates storing, retrieving, and deleting objects in an associated mass data store 322);
mapping, by the server of the cloud-based storage system, each logical shard of the plurality of logical shards to one of the plurality of physical shards (Shetty: [0108]; Object-filer map table 500A and deleted object-filer map table 500B store the logical to physical (object ID to filer 222) object map. Each shard 502(1-k) can reside in any of object databases 312(1-g) and in any physical host, and there can be more than one shard 502(1-k) in each of object databases 312(1-g)
Shetty does not explicitly teach sorting, by the server of the cloud-based storage system, the generated file keys for each file of the plurality of files into an ordered list; logically partitioning, by the server of the cloud-based storage system, the ordered list into a plurality of logical shards; identifying, by the server of the cloud-based storage system, a last key value for each logical shard in the partitioned ordered list; saving, by the server of the cloud-based storage system, the identified last key value for each logical shard in a meta-store for the physical shard to which the logical shard is mapped;
However, Stringham teaches sorting, by the server of the cloud-based storage system, the generated file keys for each file of the plurality of files into an ordered list (Stringham: [0051]; Once the cluster partition manager 202 selects the number of cluster partitions, and assigns the cluster partitions to clusters 106, the key space allocator 204 can assign identifiers, values, or ranges of a first key space (i.e., cluster key space) to each of the cluster partitions to help ensure even distribution of data among the clusters 106. [0054]; For example, in alternative embodiments the key can comprise a first letter of a user name associated with each piece of data. In this case, the cluster key space can comprise the letters of the alphabet. The key space allocator 204 can divide ranges of the letters of the alphabet between the cluster partitions of the database system 100);
logically partitioning, by the server of the cloud-based storage system, the ordered list into a plurality of logical shards (Stringham: [0044]; The multi-cluster database management system 110 can shard or distribute data among the clusters 106 and the nodes 108 of the database system 100. [0054]; For example, in alternative embodiments the key can comprise a first letter of a user name associated with each piece of data. In this case, the cluster key space can comprise the letters of the alphabet. The key space allocator 204 can divide ranges of the letters of the alphabet between the cluster partitions of the database system 100. [0062]; For example, the key space allocator 204 can directly assign a value/identifier or range of the node key space to the nodes 108 using a key range scheme or a shard key scheme);
identifying, by the server of the cloud-based storage system, a last key value for each logical shard in the partitioned ordered list (Stringham: [0051]; Once the cluster partition manager 202 selects the number of cluster partitions, and assigns the cluster partitions to clusters 106, the key space allocator 204 can assign identifiers, values, or ranges of a first key space (i.e., cluster key space) to each of the cluster partitions to help ensure even distribution of data among the clusters 106. [0054]; For example, in alternative embodiments the key can comprise a first letter of a user name associated with each piece of data. In this case, the cluster key space can comprise the letters of the alphabet. The key space allocator 204 can divide ranges of the letters of the alphabet between the cluster partitions of the database system 100
[0108]; In further embodiments, act 508 can involve identifying a first or last letter or number of the associated identifier {Examiner correlates to identifying the last key value as identifying the last letter of the associated identifier of the key space});
saving, by the server of the cloud-based storage system, the identified last key value for each logical shard in a meta-store for the physical shard to which the logical shard is mapped (Stringham: [0051]; Moreover, the cluster key space can vary depending upon the type of data being stored and the number of clusters 106 in the database system 100. The cluster key space can be based on a key used for the data stored in the database system 100. [0054]-[0055]; The key space allocator 204 can divide ranges of the letters of the alphabet between the cluster partitions of the database system 100. In particular, if there are 13 cluster partitions, the key space allocator 204 can assign each cluster partition two letters of the cluster key space (i.e., a and b assigned to cluster partition 1 and so forth). Thus, if the database system 100 includes 10 cluster partitions, the key space allocator 204 can assign 0 and a-c to cluster partition 1, 1 and d-f to cluster partition 2, and so forth. The cluster partition manager 202 can in turn assign the 10 cluster partitions to the clusters 106 of the database system 100. [0057]; Once the key ID for the data is identified, the key ID mapper 206 can determine the applicable cluster 106 for the data and route the request or the piece of data to the applicable cluster 106. For example, the key ID mapper 206 can determine to which cluster partition the key ID corresponds, and in turn to which cluster 106 the determined cluster partition is assigned);
It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the invention, to modify Shetty (teaches a file key for each file of a plurality of files stored in a plurality of physical shards, each physical shard maintained by a node of a plurality of nodes in one or more clusters mapping, by the server of the cloud-based storage system, each logical shard of the plurality of logical shards to one of the plurality of physical shards) with the teachings of Stringham (teaches logically partitioning, by the server of the cloud-based storage system, the ordered list into a plurality of logical shards to identifying, by the server of the cloud-based storage system, a last key value for each logical shard in the partitioned ordered list and saving, by the server of the cloud-based storage system, the identified last key value for each logical shard in a meta-store for the physical shard to which the logical shard is mapped). One of ordinary skill in the art would have been motivated to make such a combination of providing better results in dynamically increasing the cluster partition in an efficient manner by setting a new number to allow efficient redistribution (See Stringham [0049]). In addition, the references (Shetty and Stringham) teach features that are directed to analogous art and they are directed to the same field of endeavor as Shetty and Stringham are directed to mapping and storing partition information based on a key value.
The modification of Shetty and Stringham teaches claimed invention substantially as claimed, however the modification of Shetty and Stringham does not explicitly teach partially re-indexing, by the server of the cloud-based storage system, the plurality of shards of the one or more clusters based on addition of a node to the plurality of node in the one or more clusters, wherein partially re-indexing the plurality of shards comprises selecting a set of largest shards in the one or more clusters based on an index size of each shard, redirecting traffic for each shard of the selected set of shards to a different cluster of the one or more clusters while the plurality of shards are being partially reindexed, and re-directing traffic back to each shard of the selected set of shards upon completion of partially reindexing the plurality of shards.
However, ISHERWOOD teaches partially re-indexing, by the server of the cloud-based storage system, the plurality of shards of the one or more clusters based on addition of a node to the plurality of node in the one or more clusters (ISHERWOOD: [0004]-[0005]; This technique is used to manage a database search index to achieve high availability and allow for node expansion/reduction without requiring the regeneration of the whole index. 1. When a node is added to the cluster, there are no new regions added to the cluster. Instead, the region indices are redistributed amongst all the nodes in the cluster to help distribute the work load. [0082]; On node addition, no new regions/shards will be added to the cluster/system, but the node addition remapping module will redistribute the shard cores throughout the cluster to distribute the work load), wherein
partially re-indexing the plurality of shards comprises selecting a set of largest shards in the one or more clusters based on an index size of each shard (ISHERWOOD: [0091]-[0092]; In the shard distribution and remapping processes described above, the indexing module 342, node addition remapping module, node removal remapping module, and core recovery remapping module decide where to place/move shard cores. When placing/moving shard cores, the determination of which volume to use is based on both shard core or shard distribution and on load calculation. Volume load is determined typically by a “best match” or “best fit” procedure. This ratio heavily favors volumes which have no shard cores or which have lots of free space for shard cores. The volumes are sorted by ratio (high to low) and are traversed until the module finds the best possible match (e.g., most space {Examiner correlates that the volume with the most space when moving the shard cores would indicate selecting the largest shards based on the index size to be move due to the most space}),
redirecting traffic for each shard of the selected set of shards to a different cluster of the one or more clusters while the plurality of shards are being partially reindexed (ISHERWOOD: [0082]; On node addition, no new regions/shards will be added to the cluster/system, but the node addition remapping module will redistribute the shard cores throughout the cluster to distribute the work load. In step 808, it regenerates the region mapping table using the content index hash algorithm and new locations of the shard cores. [0091]-[0092]; When placing/moving shard cores, the determination of which volume to use is based on both shard core or shard distribution and on load calculation. This ratio heavily favors volumes which have no shard cores or which have lots of free space for shard cores. The volumes are sorted by ratio (high to low) and are traversed until the module finds the best possible match (e.g., most space {Examiner correlates redirecting traffic to a different cluster based on moving the shard cores to volumes with a lot of space where the node that is being added is regenerated a mapping table to indicate the impacted space that have been moved}}), and
re-directing traffic back to each shard of the selected set of shards upon completion of partially reindexing the plurality of shards (ISHERWOOD: [0008]; If necessary, as when a node is added or removed/regenerated, the hash algorithm is re-evaluated to facilitate the new region distribution. With the new hash algorithm, a new region mapping table is regenerated and distributed to all nodes in the cluster {Examiner correlates that when the new region mapping table is generated based on the node being regenerated, only the data that within the affected region are regenerated based on revaluating the new region distribution to be distributed to the other nodes in the cluster to indicate the change}).
It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the invention, to modify Shetty (teaches a file key for each file of a plurality of files stored in a plurality of physical shards, each physical shard maintained by a node of a plurality of nodes in one or more clusters mapping, by the server of the cloud-based storage system, each logical shard of the plurality of logical shards to one of the plurality of physical shards) with the teachings of Stringham (teaches logically partitioning, by the server of the cloud-based storage system, the ordered list into a plurality of logical shards to identifying, by the server of the cloud-based storage system, a last key value for each logical shard in the partitioned ordered list and saving, by the server of the cloud-based storage system, the identified last key value for each logical shard in a meta-store for the physical shard to which the logical shard is mapped) with the further teachings of ISHERWOOD (teaches partially re-indexing the plurality of shards comprises selecting a set of largest shards in the one or more clusters based on an index size of each shard, redirecting traffic for each shard of the selected set of shards to a different cluster of the one or more clusters while the plurality of shards are being partially reindexed, and re-directing traffic back to each shard of the selected set of shards upon completion of partially reindexing the plurality of shards). One of ordinary skill in the art would have been motivated to make such a combination of providing better results in making query indexing more efficient to prevent impact (See ISHERWOOD [0079]). In addition, the references (Shetty, Stringham, and ISHERWOOD) teach features that are directed to analogous art and they are directed to the same field of endeavor as Shetty, Stringham, and ISHERWOOD are directed to mapping and storing partition information based on replicating data.
Regarding claim 2, the modification of Shetty, Stringham, and ISHERWOOD teaches claimed invention substantially as claimed, and Stringham further teaches further comprising indexing a new file in the plurality of physical shards, wherein the indexing comprises: receiving, by the server of the cloud-based storage system, the new file (Stringham: [0045]; the multi-cluster database management system 110 can be included in each node 108 of the database system 100. For example, a table or index including a mapping scheme (described in greater detail below in reference to FIG. 3A) can reside on each node 108. The nodes 108 can use the table or index to identify where to route requests or whether to respond to requests);
generating, by the server of the cloud-based storage system, a file key for the new file (Stringham: [0051]-[0052]; Once the cluster partition manager 202 selects the number of cluster partitions, and assigns the cluster partitions to clusters 106, the key space allocator 204 can assign identifiers, values, or ranges of a first key space (i.e., cluster key space) to each of the cluster partitions to help ensure even distribution of data among the clusters 106. For example, in one embodiment the key can comprise a hash of a document ID, a user ID, the key portion of a key/value pair, or another data identifier associated with a piece of data, or a combination of multiple identifiers. [0056]; For example, the multi-cluster database management system 110 can receive a request to process a piece of data. The request to process a piece of data can comprise a request to write data to the database system 100, read data from the database system 100, update data in the database system 100, increment data values within the database management system 100, access data from the database system 100, or otherwise manipulate data stored or data to be stored in the database system 100);
identifying, by the server of the cloud-based storage system, a target shard for the new file based on the generated file key for the new file and the saved identified last key value for each logical shard saved in the meta-store for each physical shard (Stringham: [0057]; Upon receiving the request, the key ID mapper 206 can identify a key ID for the piece of data. For example, the key ID mapper 206 can perform a hash of an ID associated with the piece of data, or otherwise identify a key ID for the piece of data. Once the key ID for the data is identified, the key ID mapper 206 can determine the applicable cluster 106 for the data and route the request or the piece of data to the applicable cluster 106); and
routing, by the server of the cloud-based storage system, the new file to the identified target shard (Stringham: [0045]; the multi-cluster database management system 110 can be included in each node 108 of the database system 100. For example, a table or index including a mapping scheme (described in greater detail below in reference to FIG. 3A) can reside on each node 108. The nodes 108 can use the table or index to identify where to route requests or whether to respond to requests. [0057]; Upon receiving the request, the key ID mapper 206 can identify a key ID for the piece of data. For example, the key ID mapper 206 can perform a hash of an ID associated with the piece of data, or otherwise identify a key ID for the piece of data. Once the key ID for the data is identified, the key ID mapper 206 can determine the applicable cluster 106 for the data and route the request or the piece of data to the applicable cluster 106).
Regarding claim 3, the modification of Shetty, Stringham, and ISHERWOOD teaches claimed invention substantially as claimed, and Shetty further comprising processing a query from a user, wherein processing the query comprises: receiving, by the server of the cloud-based storage system, an enterprise ID associated with the user (Shetty: [0096]; Client ID field 404 is the key field for table 400A and uniquely identifies one of clients 112(1-b) or local cloud 104. Client information field 408 includes information (e.g., name, address, etc.) associated with entity identified by client ID field 404);
However, Shetty does not explicitly teach identifying, by the server of the cloud-based storage system, one or more shards based on the received enterprise ID associated with the user and the saved identified last key value for each logical shard saved in the meta-store for each physical shard; and routing, by the server of the cloud-based storage system, the query to the identified one or more shards.
However, Stringham further teaches identifying, by the server of the cloud-based storage system, one or more shards based on the received enterprise ID associated with the user and the saved identified last key value for each logical shard saved in the meta-store for each physical shard (Stringham: [0051]-[0052]; Once the cluster partition manager 202 selects the number of cluster partitions, and assigns the cluster partitions to clusters 106, the key space allocator 204 can assign identifiers, values, or ranges of a first key space (i.e., cluster key space) to each of the cluster partitions to help ensure even distribution of data among the clusters 106. For example, in one embodiment the key can comprise a hash of a document ID, a user ID, the key portion of a key/value pair, or another data identifier associated with a piece of data, or a combination of multiple identifiers. [0056]; For example, the multi-cluster database management system 110 can receive a request to process a piece of data. The request to process a piece of data can comprise a request to write data to the database system 100, read data from the database system 100, update data in the database system 100, increment data values within the database management system 100, access data from the database system 100, or otherwise manipulate data stored or data to be stored in the database system 100); and
routing, by the server of the cloud-based storage system, the query to the identified one or more shards (Stringham: [0045]; the multi-cluster database management system 110 can be included in each node 108 of the database system 100. For example, a table or index including a mapping scheme (described in greater detail below in reference to FIG. 3A) can reside on each node 108. The nodes 108 can use the table or index to identify where to route requests or whether to respond to requests. [0057]; Upon receiving the request, the key ID mapper 206 can identify a key ID for the piece of data. For example, the key ID mapper 206 can perform a hash of an ID associated with the piece of data, or otherwise identify a key ID for the piece of data. Once the key ID for the data is identified, the key ID mapper 206 can determine the applicable cluster 106 for the data and route the request or the piece of data to the applicable cluster 106).
Regarding claim 8, Shetty teaches a server of a cloud-based storage system, the server comprising: a processor (Shetty: [0218]; Cloud application server 308 includes one or more processing unit(s) (PU) 1602); and
a memory coupled with and readable by the processor and storing therein a set of instructions which (Shetty: [0218]; Cloud application server 308 includes one or more processing unit(s) (PU) 1602, non-volatile memory 1604),
when executed by the processor, causes the processor to distribute data across a plurality of storage shards by: generating a file key for each file of a plurality of files stored in a plurality of physical shards (Shetty: [0096]; Client ID field 404 is the key field for table 400A and uniquely identifies one of clients 112(1-b) or local cloud 104. Shard information field 406 provides shard information associated with a shard of tables 400B-400G, as will be described below. Shard information field 406 contains information sufficient to identify and access the particular shard of tables 400B-400G associated with the entity identified by client ID field 404. [0122]; First field 602 of UUID 600 includes a shard identifier (e.g., an alpha-numeric key, etc.) associated with one of shard records 530(1-k), which in turn identifies one of object-filer map shards 502(1-k)),
each physical shard maintained by a node of a plurality of nodes in one or more clusters (Shetty: [0085]; Filers 222(1-n) are storage nodes for the digital objects stored in cloud 102. Each filer 222(1-n) is very generic and includes at least one instance of a storage node service that communicates with private network 302 and facilitates storing, retrieving, and deleting objects in an associated mass data store 322);
mapping each logical shard of the plurality of logical shards to one of the plurality of physical shards (Shetty: [0108]; Object-filer map table 500A and deleted object-filer map table 500B store the logical to physical (object ID to filer 222) object map. Each shard 502(1-k) can reside in any of object databases 312(1-g) and in any physical host, and there can be more than one shard 502(1-k) in each of object databases 312(1-g));
Shetty does not explicitly teach sorting the generated file keys for each file of the plurality of files into an ordered list; logically partitioning the ordered list into a plurality of logical shards; identifying a last key value for each logical shard in the partitioned ordered list; saving the identified last key value for each logical shard in a meta-store for the physical shard to which the logical shard is mapped;
However, Stringham teaches sorting the generated file keys for each file of the plurality of files into an ordered list (Stringham: [0051]; Once the cluster partition manager 202 selects the number of cluster partitions, and assigns the cluster partitions to clusters 106, the key space allocator 204 can assign identifiers, values, or ranges of a first key space (i.e., cluster key space) to each of the cluster partitions to help ensure even distribution of data among the clusters 106. [0054]; For example, in alternative embodiments the key can comprise a first letter of a user name associated with each piece of data. In this case, the cluster key space can comprise the letters of the alphabet. The key space allocator 204 can divide ranges of the letters of the alphabet between the cluster partitions of the database system 100);
logically partitioning the ordered list into a plurality of logical shards (Stringham: [0044]; The multi-cluster database management system 110 can shard or distribute data among the clusters 106 and the nodes 108 of the database system 100. [0054]; For example, in alternative embodiments the key can comprise a first letter of a user name associated with each piece of data. In this case, the cluster key space can comprise the letters of the alphabet. The key space allocator 204 can divide ranges of the letters of the alphabet between the cluster partitions of the database system 100. [0062]; For example, the key space allocator 204 can directly assign a value/identifier or range of the node key space to the nodes 108 using a key range scheme or a shard key scheme);
identifying a last key value for each logical shard in the partitioned ordered list (Stringham: [0051]; Once the cluster partition manager 202 selects the number of cluster partitions, and assigns the cluster partitions to clusters 106, the key space allocator 204 can assign identifiers, values, or ranges of a first key space (i.e., cluster key space) to each of the cluster partitions to help ensure even distribution of data among the clusters 106. [0054]; For example, in alternative embodiments the key can comprise a first letter of a user name associated with each piece of data. In this case, the cluster key space can comprise the letters of the alphabet. The key space allocator 204 can divide ranges of the letters of the alphabet between the cluster partitions of the database system 100. [0108]; In further embodiments, act 508 can involve identifying a first or last letter or number of the associated identifier);
saving the identified last key value for each logical shard in a meta-store for the physical shard to which the logical shard is mapped (Stringham: [0051]; Moreover, the cluster key space can vary depending upon the type of data being stored and the number of clusters 106 in the database system 100. The cluster key space can be based on a key used for the data stored in the database system 100. [0054]-[0055]; The key space allocator 204 can divide ranges of the letters of the alphabet between the cluster partitions of the database system 100. In particular, if there are 13 cluster partitions, the key space allocator 204 can assign each cluster partition two letters of the cluster key space (i.e., a and b assigned to cluster partition 1 and so forth). Thus, if the database system 100 includes 10 cluster partitions, the key space allocator 204 can assign 0 and a-c to cluster partition 1, 1 and d-f to cluster partition 2, and so forth. The cluster partition manager 202 can in turn assign the 10 cluster partitions to the clusters 106 of the database system 100. [0057]; Once the key ID for the data is identified, the key ID mapper 206 can determine the applicable cluster 106 for the data and route the request or the piece of data to the applicable cluster 106. For example, the key ID mapper 206 can determine to which cluster partition the key ID corresponds, and in turn to which cluster 106 the determined cluster partition is assigned);
It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the invention, to modify Shetty (teaches a file key for each file of a plurality of files stored in a plurality of physical shards, each physical shard maintained by a node of a plurality of nodes in one or more clusters mapping, by the server of the cloud-based storage system, each logical shard of the plurality of logical shards to one of the plurality of physical shards) with the teachings of Stringham (teaches logically partitioning, by the server of the cloud-based storage system, the ordered list into a plurality of logical shards to identifying, by the server of the cloud-based storage system, a last key value for each logical shard in the partitioned ordered list and saving, by the server of the cloud-based storage system, the identified last key value for each logical shard in a meta-store for the physical shard to which the logical shard is mapped). One of ordinary skill in the art would have been motivated to make such a combination of providing better results in dynamically increasing the cluster partition in an efficient manner by setting a new number to allow efficient redistribution (See Stringham [0049]). In addition, the references (Shetty and Stringham) teach features that are directed to analogous art and they are directed to the same field of endeavor as Shetty and Stringham are directed to mapping and storing partition information based on a key value.
The modification of Shetty and Stringham teaches claimed invention substantially as claimed, however the modification of Shetty and Stringham does not explicitly teach partially re-indexing the plurality of shards of the one or more clusters based on addition of a node to the plurality of node in the one or more clusters, wherein partially re-indexing the plurality of shards comprises selecting a set of largest shards in the one or more clusters based on an index size of each shard, redirecting traffic for each shard of the selected set of shards to a different cluster of the one or more clusters while the plurality of shards are being partially reindexed, and re-directing traffic back to each shard of the selected set of shards upon completion of partially reindexing the plurality of shards.
However, ISHERWOOD teaches partially re-indexing the plurality of shards of the one or more clusters based on addition of a node to the plurality of node in the one or more clusters (ISHERWOOD: [0004]-[0005]; This technique is used to manage a database search index to achieve high availability and allow for node expansion/reduction without requiring the regeneration of the whole index. 1. When a node is added to the cluster, there are no new regions added to the cluster. Instead, the region indices are redistributed amongst all the nodes in the cluster to help distribute the work load. [0082]; On node addition, no new regions/shards will be added to the cluster/system, but the node addition remapping module will redistribute the shard cores throughout the cluster to distribute the work load), wherein
partially re-indexing the plurality of shards comprises selecting a set of largest shards in the one or more clusters based on an index size of each shard (ISHERWOOD: [0091]-[0092]; In the shard distribution and remapping processes described above, the indexing module 342, node addition remapping module, node removal remapping module, and core recovery remapping module decide where to place/move shard cores. When placing/moving shard cores, the determination of which volume to use is based on both shard core or shard distribution and on load calculation. Volume load is determined typically by a “best match” or “best fit” procedure. This ratio heavily favors volumes which have no shard cores or which have lots of free space for shard cores. The volumes are sorted by ratio (high to low) and are traversed until the module finds the best possible match (e.g., most space),
redirecting traffic for each shard of the selected set of shards to a different cluster of the one or more clusters while the plurality of shards are being partially reindexed (ISHERWOOD: [0082]; On node addition, no new regions/shards will be added to the cluster/system, but the node addition remapping module will redistribute the shard cores throughout the cluster to distribute the work load. In step 808, it regenerates the region mapping table using the content index hash algorithm and new locations of the shard cores. [0091]-[0092]; When placing/moving shard cores, the determination of which volume to use is based on both shard core or shard distribution and on load calculation. This ratio heavily favors volumes which have no shard cores or which have lots of free space for shard cores. The volumes are sorted by ratio (high to low) and are traversed until the module finds the best possible match (e.g., most space), and
re-directing traffic back to each shard of the selected set of shards upon completion of partially reindexing the plurality of shards (ISHERWOOD: [0008]; If necessary, as when a node is added or removed/regenerated, the hash algorithm is re-evaluated to facilitate the new region distribution. With the new hash algorithm, a new region mapping table is regenerated and distributed to all nodes in the cluster).
It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the invention, to modify Shetty (teaches a file key for each file of a plurality of files stored in a plurality of physical shards, each physical shard maintained by a node of a plurality of nodes in one or more clusters mapping, by the server of the cloud-based storage system, each logical shard of the plurality of logical shards to one of the plurality of physical shards) with the teachings of Stringham (teaches logically partitioning, by the server of the cloud-based storage system, the ordered list into a plurality of logical shards to identifying, by the server of the cloud-based storage system, a last key value for each logical shard in the partitioned ordered list and saving, by the server of the cloud-based storage system, the identified last key value for each logical shard in a meta-store for the physical shard to which the logical shard is mapped) with the further teachings of ISHERWOOD (teaches partially re-indexing the plurality of shards comprises selecting a set of largest shards in the one or more clusters based on an index size of each shard, redirecting traffic for each shard of the selected set of shards to a different cluster of the one or more clusters while the plurality of shards are being partially reindexed, and re-directing traffic back to each shard of the selected set of shards upon completion of partially reindexing the plurality of shards). One of ordinary skill in the art would have been motivated to make such a combination of providing better results in making query indexing more efficient to prevent impact (See ISHERWOOD [0079]). In addition, the references (Shetty, Stringham, and ISHERWOOD) teach features that are directed to analogous art and they are directed to the same field of endeavor as Shetty, Stringham, and ISHERWOOD are directed to mapping and storing partition information based on replicating data.
Regarding claim 9, the modification of Shetty, Stringham, and ISHERWOOD teaches claimed invention substantially as claimed, and Stringham further teaches the instructions further cause the processor to index a new file in the plurality of physical shards by: receiving the new file (Stringham: [0045]; the multi-cluster database management system 110 can be included in each node 108 of the database system 100. For example, a table or index including a mapping scheme (described in greater detail below in reference to FIG. 3A) can reside on each node 108. The nodes 108 can use the table or index to identify where to route requests or whether to respond to requests);
generating a file key for the new file (Stringham: [0051]-[0052]; Once the cluster partition manager 202 selects the number of cluster partitions, and assigns the cluster partitions to clusters 106, the key space allocator 204 can assign identifiers, values, or ranges of a first key space (i.e., cluster key space) to each of the cluster partitions to help ensure even distribution of data among the clusters 106. For example, in one embodiment the key can comprise a hash of a document ID, a user ID, the key portion of a key/value pair, or another data identifier associated with a piece of data, or a combination of multiple identifiers. [0056]; For example, the multi-cluster database management system 110 can receive a request to process a piece of data. The request to process a piece of data can comprise a request to write data to the database system 100, read data from the database system 100, update data in the database system 100, increment data values within the database management system 100, access data from the database system 100, or otherwise manipulate data stored or data to be stored in the database system 100);
identifying a target shard for the new file based on the generated file key for the new file and the saved identified last key value for each logical shard saved in the meta-store for each physical shard (Stringham: [0057]; Upon receiving the request, the key ID mapper 206 can identify a key ID for the piece of data. For example, the key ID mapper 206 can perform a hash of an ID associated with the piece of data, or otherwise identify a key ID for the piece of data. Once the key ID for the data is identified, the key ID mapper 206 can determine the applicable cluster 106 for the data and route the request or the piece of data to the applicable cluster 106); and
routing the new file to the identified target shard (Stringham: [0045]; the multi-cluster database management system 110 can be included in each node 108 of the database system 100. For example, a table or index including a mapping scheme (described in greater detail below in reference to FIG. 3A) can reside on each node 108. The nodes 108 can use the table or index to identify where to route requests or whether to respond to requests. [0057]; Upon receiving the request, the key ID mapper 206 can identify a key ID for the piece of data. For example, the key ID mapper 206 can perform a hash of an ID associated with the piece of data, or otherwise identify a key ID for the piece of data. Once the key ID for the data is identified, the key ID mapper 206 can determine the applicable cluster 106 for the data and route the request or the piece of data to the applicable cluster 106).
Regarding claim 10, the modification of Shetty, Stringham, and ISHERWOOD teaches claimed invention substantially as claimed, and Shetty further teaches the instructions further cause the processor to process a query from a user by: receiving an enterprise ID associated with the user (Shetty: [0096]; Client ID field 404 is the key field for table 400A and uniquely identifies one of clients 112(1-b) or local cloud 104. Client information field 408 includes information (e.g., name, address, etc.) associated with entity identified by client ID field 404);
Shetty does not explicitly teach identifying one or more shards based on the received enterprise ID associated with the user and the saved identified last key value for each logical shard saved in the meta-store for each physical shard; and routing the query to the identified one or more shards.
However, Stringham teaches identifying one or more shards based on the received enterprise ID associated with the user and the saved identified last key value for each logical shard saved in the meta-store for each physical shard (Stringham: [0057]; Upon receiving the request, the key ID mapper 206 can identify a key ID for the piece of data. For example, the key ID mapper 206 can perform a hash of an ID associated with the piece of data, or otherwise identify a key ID for the piece of data. Once the key ID for the data is identified, the key ID mapper 206 can determine the applicable cluster 106 for the data and route the request or the piece of data to the applicable cluster 106); and
routing the query to the identified one or more shards (Stringham: [0045]; the multi-cluster database management system 110 can be included in each node 108 of the database system 100. For example, a table or index including a mapping scheme (described in greater detail below in reference to FIG. 3A) can reside on each node 108. The nodes 108 can use the table or index to identify where to route requests or whether to respond to requests. [0057]; Upon receiving the request, the key ID mapper 206 can identify a key ID for the piece of data. For example, the key ID mapper 206 can perform a hash of an ID associated with the piece of data, or otherwise identify a key ID for the piece of data. Once the key ID for the data is identified, the key ID mapper 206 can determine the applicable cluster 106 for the data and route the request or the piece of data to the applicable cluster 106).
Regarding claim 15, Shetty teaches a non-transitory, computer-readable medium comprising a set of instructions stored therein which (Shetty: [0024];
Non-transitory, electronically-readable storage medium having code embodied therein for causing an electronic device to perform the methods of the invention are also described), when executed by a processor (Shetty: [0063]; Processing units(s) 204 impart functionality to cloud 102 by executing code stored in any or all of non-volatile memory 214),
causes the processor to distribute data across a plurality of storage shards by: generating a file key for each file of a plurality of files stored in a plurality of physical shards (Shetty: [0096]; Client ID field 404 is the key field for table 400A and uniquely identifies one of clients 112(1-b) or local cloud 104. Shard information field 406 provides shard information associated with a shard of tables 400B-400G, as will be described below. Shard information field 406 contains information sufficient to identify and access the particular shard of tables 400B-400G associated with the entity identified by client ID field 404. [0122]; First field 602 of UUID 600 includes a shard identifier (e.g., an alpha-numeric key, etc.) associated with one of shard records 530(1-k), which in turn identifies one of object-filer map shards 502(1-k)),
each physical shard maintained by a node of a plurality of nodes in one or more clusters (Shetty: [0085]; Filers 222(1-n) are storage nodes for the digital objects stored in cloud 102. Each filer 222(1-n) is very generic and includes at least one instance of a storage node service that communicates with private network 302 and facilitates storing, retrieving, and deleting objects in an associated mass data store 322);
mapping each logical shard of the plurality of logical shards to one of the plurality of physical shards (Shetty: [0108]; Object-filer map table 500A and deleted object-filer map table 500B store the logical to physical (object ID to filer 222) object map. Each shard 502(1-k) can reside in any of object databases 312(1-g) and in any physical host, and there can be more than one shard 502(1-k) in each of object databases 312(1-g));
Shetty does not explicitly teach sorting the generated file keys for each file of the plurality of files into an ordered list; logically partitioning the ordered list into a plurality of logical shards; identifying a last key value for each logical shard in the partitioned ordered list; saving the identified last key value for each logical shard in a meta-store for the physical shard to which the logical shard is mapped;
However, Stringham teaches sorting the generated file keys for each file of the plurality of files into an ordered list (Stringham: [0051]; Once the cluster partition manager 202 selects the number of cluster partitions, and assigns the cluster partitions to clusters 106, the key space allocator 204 can assign identifiers, values, or ranges of a first key space (i.e., cluster key space) to each of the cluster partitions to help ensure even distribution of data among the clusters 106. [0054]; For example, in alternative embodiments the key can comprise a first letter of a user name associated with each piece of data. In this case, the cluster key space can comprise the letters of the alphabet. The key space allocator 204 can divide ranges of the letters of the alphabet between the cluster partitions of the database system 100);
logically partitioning the ordered list into a plurality of logical shards (Stringham: [0044]; The multi-cluster database management system 110 can shard or distribute data among the clusters 106 and the nodes 108 of the database system 100. [0054]; For example, in alternative embodiments the key can comprise a first letter of a user name associated with each piece of data. In this case, the cluster key space can comprise the letters of the alphabet. The key space allocator 204 can divide ranges of the letters of the alphabet between the cluster partitions of the database system 100. [0062]; For example, the key space allocator 204 can directly assign a value/identifier or range of the node key space to the nodes 108 using a key range scheme or a shard key scheme);
identifying a last key value for each logical shard in the partitioned ordered list (Stringham: [0051]; Once the cluster partition manager 202 selects the number of cluster partitions, and assigns the cluster partitions to clusters 106, the key space allocator 204 can assign identifiers, values, or ranges of a first key space (i.e., cluster key space) to each of the cluster partitions to help ensure even distribution of data among the clusters 106. [0054]; For example, in alternative embodiments the key can comprise a first letter of a user name associated with each piece of data. In this case, the cluster key space can comprise the letters of the alphabet. The key space allocator 204 can divide ranges of the letters of the alphabet between the cluster partitions of the database system 100. [0108]; In further embodiments, act 508 can involve identifying a first or last letter or number of the associated identifier);
saving the identified last key value for each logical shard in a meta-store for the physical shard to which the logical shard is mapped (Stringham: [0051]; Moreover, the cluster key space can vary depending upon the type of data being stored and the number of clusters 106 in the database system 100. The cluster key space can be based on a key used for the data stored in the database system 100. [0054]-[0055]; The key space allocator 204 can divide ranges of the letters of the alphabet between the cluster partitions of the database system 100. In particular, if there are 13 cluster partitions, the key space allocator 204 can assign each cluster partition two letters of the cluster key space (i.e., a and b assigned to cluster partition 1 and so forth). Thus, if the database system 100 includes 10 cluster partitions, the key space allocator 204 can assign 0 and a-c to cluster partition 1, 1 and d-f to cluster partition 2, and so forth. The cluster partition manager 202 can in turn assign the 10 cluster partitions to the clusters 106 of the database system 100. [0057]; Once the key ID for the data is identified, the key ID mapper 206 can determine the applicable cluster 106 for the data and route the request or the piece of data to the applicable cluster 106. For example, the key ID mapper 206 can determine to which cluster partition the key ID corresponds, and in turn to which cluster 106 the determined cluster partition is assigned);
It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the invention, to modify Shetty (teaches a file key for each file of a plurality of files stored in a plurality of physical shards, each physical shard maintained by a node of a plurality of nodes in one or more clusters mapping, by the server of the cloud-based storage system, each logical shard of the plurality of logical shards to one of the plurality of physical shards) with the teachings of Stringham (teaches logically partitioning, by the server of the cloud-based storage system, the ordered list into a plurality of logical shards to identifying, by the server of the cloud-based storage system, a last key value for each logical shard in the partitioned ordered list and saving, by the server of the cloud-based storage system, the identified last key value for each logical shard in a meta-store for the physical shard to which the logical shard is mapped). One of ordinary skill in the art would have been motivated to make such a combination of providing better results in dynamically increasing the cluster partition in an efficient manner by setting a new number to allow efficient redistribution (See Stringham [0049]). In addition, the references (Shetty and Stringham) teach features that are directed to analogous art and they are directed to the same field of endeavor as Shetty and Stringham are directed to mapping and storing partition information based on a key value.
The modification of Shetty and Stringham teaches claimed invention substantially as claimed, however the modification of Shetty and Stringham does not explicitly teach partially re-indexing the plurality of shards of the one or more clusters based on addition of a node to the plurality of node in the one or more clusters, wherein partially re-indexing the plurality of shards comprises selecting a set of largest shards in the one or more clusters based on an index size of each shard, redirecting traffic for each shard of the selected set of shards to a different cluster of the one or more clusters while the plurality of shards are being partially reindexed, and re-directing traffic back to each shard of the selected set of shards upon completion of partially reindexing the plurality of shards.
However, ISHERWOOD teaches partially re-indexing the plurality of shards of the one or more clusters based on addition of a node to the plurality of node in the one or more clusters (ISHERWOOD: [0004]-[0005]; This technique is used to manage a database search index to achieve high availability and allow for node expansion/reduction without requiring the regeneration of the whole index. 1. When a node is added to the cluster, there are no new regions added to the cluster. Instead, the region indices are redistributed amongst all the nodes in the cluster to help distribute the work load. [0082]; On node addition, no new regions/shards will be added to the cluster/system, but the node addition remapping module will redistribute the shard cores throughout the cluster to distribute the work load), wherein
partially re-indexing the plurality of shards comprises selecting a set of largest shards in the one or more clusters based on an index size of each shard (ISHERWOOD: [0091]-[0092]; In the shard distribution and remapping processes described above, the indexing module 342, node addition remapping module, node removal remapping module, and core recovery remapping module decide where to place/move shard cores. When placing/moving shard cores, the determination of which volume to use is based on both shard core or shard distribution and on load calculation. Volume load is determined typically by a “best match” or “best fit” procedure. This ratio heavily favors volumes which have no shard cores or which have lots of free space for shard cores. The volumes are sorted by ratio (high to low) and are traversed until the module finds the best possible match (e.g., most space),
redirecting traffic for each shard of the selected set of shards to a different cluster of the one or more clusters while the plurality of shards are being partially reindexed (ISHERWOOD: [0082]; On node addition, no new regions/shards will be added to the cluster/system, but the node addition remapping module will redistribute the shard cores throughout the cluster to distribute the work load. In step 808, it regenerates the region mapping table using the content index hash algorithm and new locations of the shard cores. [0091]-[0092]; When placing/moving shard cores, the determination of which volume to use is based on both shard core or shard distribution and on load calculation. This ratio heavily favors volumes which have no shard cores or which have lots of free space for shard cores. The volumes are sorted by ratio (high to low) and are traversed until the module finds the best possible match (e.g., most space), and
re-directing traffic back to each shard of the selected set of shards upon completion of partially reindexing the plurality of shards (ISHERWOOD: [0008]; If necessary, as when a node is added or removed/regenerated, the hash algorithm is re-evaluated to facilitate the new region distribution. With the new hash algorithm, a new region mapping table is regenerated and distributed to all nodes in the cluster).
It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the invention, to modify Shetty (teaches a file key for each file of a plurality of files stored in a plurality of physical shards, each physical shard maintained by a node of a plurality of nodes in one or more clusters mapping, by the server of the cloud-based storage system, each logical shard of the plurality of logical shards to one of the plurality of physical shards) with the teachings of Stringham (teaches logically partitioning, by the server of the cloud-based storage system, the ordered list into a plurality of logical shards to identifying, by the server of the cloud-based storage system, a last key value for each logical shard in the partitioned ordered list and saving, by the server of the cloud-based storage system, the identified last key value for each logical shard in a meta-store for the physical shard to which the logical shard is mapped) with the further teachings of ISHERWOOD (teaches partially re-indexing the plurality of shards comprises selecting a set of largest shards in the one or more clusters based on an index size of each shard, redirecting traffic for each shard of the selected set of shards to a different cluster of the one or more clusters while the plurality of shards are being partially reindexed, and re-directing traffic back to each shard of the selected set of shards upon completion of partially reindexing the plurality of shards). One of ordinary skill in the art would have been motivated to make such a combination of providing better results in making query indexing more efficient to prevent impact (See ISHERWOOD [0079]). In addition, the references (Shetty, Stringham, and ISHERWOOD) teach features that are directed to analogous art and they are directed to the same field of endeavor as Shetty, Stringham, and ISHERWOOD are directed to mapping and storing partition information based on replicating data.
Regarding claim 16, the modification of Shetty, Stringham, and ISHERWOOD teaches claimed invention substantially as claimed, and Stringham further teaches the instructions further cause the processor to index a new file in the plurality of physical shards by: receiving the new file (Stringham: [0045]; the multi-cluster database management system 110 can be included in each node 108 of the database system 100. For example, a table or index including a mapping scheme (described in greater detail below in reference to FIG. 3A) can reside on each node 108. The nodes 108 can use the table or index to identify where to route requests or whether to respond to requests);
generating a file key for the new file (Stringham: [0051]-[0052]; Once the cluster partition manager 202 selects the number of cluster partitions, and assigns the cluster partitions to clusters 106, the key space allocator 204 can assign identifiers, values, or ranges of a first key space (i.e., cluster key space) to each of the cluster partitions to help ensure even distribution of data among the clusters 106. For example, in one embodiment the key can comprise a hash of a document ID, a user ID, the key portion of a key/value pair, or another data identifier associated with a piece of data, or a combination of multiple identifiers. [0056]; For example, the multi-cluster database management system 110 can receive a request to process a piece of data. The request to process a piece of data can comprise a request to write data to the database system 100, read data from the database system 100, update data in the database system 100, increment data values within the database management system 100, access data from the database system 100, or otherwise manipulate data stored or data to be stored in the database system 100);
identifying a target shard for the new file based on the generated file key for the new file and the saved identified last key value for each logical shard saved in the meta-store for each physical shard (Stringham: [0057]; Upon receiving the request, the key ID mapper 206 can identify a key ID for the piece of data. For example, the key ID mapper 206 can perform a hash of an ID associated with the piece of data, or otherwise identify a key ID for the piece of data. Once the key ID for the data is identified, the key ID mapper 206 can determine the applicable cluster 106 for the data and route the request or the piece of data to the applicable cluster 106); and
routing the new file to the identified target shard (Stringham: [0045]; the multi-cluster database management system 110 can be included in each node 108 of the database system 100. For example, a table or index including a mapping scheme (described in greater detail below in reference to FIG. 3A) can reside on each node 108. The nodes 108 can use the table or index to identify where to route requests or whether to respond to requests. [0057]; Upon receiving the request, the key ID mapper 206 can identify a key ID for the piece of data. For example, the key ID mapper 206 can perform a hash of an ID associated with the piece of data, or otherwise identify a key ID for the piece of data. Once the key ID for the data is identified, the key ID mapper 206 can determine the applicable cluster 106 for the data and route the request or the piece of data to the applicable cluster 106).
Regarding claim 17, the modification of Shetty, Stringham, and ISHERWOOD teaches claimed invention substantially as claimed, and Shetty further teaches the instructions further cause the processor to process a query from a user by: receiving an enterprise ID associated with the user (Shetty: [0096]; Client ID field 404 is the key field for table 400A and uniquely identifies one of clients 112(1-b) or local cloud 104. Client information field 408 includes information (e.g., name, address, etc.) associated with entity identified by client ID field 404);
Shetty does not explicitly teach identifying one or more shards based on the received enterprise ID associated with the user and the saved identified last key value for each logical shard saved in the meta-store for each physical shard; and routing the query to the identified one or more shards.
However, Stringham further teaches identifying one or more shards based on the received enterprise ID associated with the user and the saved identified last key value for each logical shard saved in the meta-store for each physical shard (Stringham: [0057]; Upon receiving the request, the key ID mapper 206 can identify a key ID for the piece of data. For example, the key ID mapper 206 can perform a hash of an ID associated with the piece of data, or otherwise identify a key ID for the piece of data. Once the key ID for the data is identified, the key ID mapper 206 can determine the applicable cluster 106 for the data and route the request or the piece of data to the applicable cluster 106); and
routing the query to the identified one or more shards (Stringham: [0045]; the multi-cluster database management system 110 can be included in each node 108 of the database system 100. For example, a table or index including a mapping scheme (described in greater detail below in reference to FIG. 3A) can reside on each node 108. The nodes 108 can use the table or index to identify where to route requests or whether to respond to requests. [0057]; Upon receiving the request, the key ID mapper 206 can identify a key ID for the piece of data. For example, the key ID mapper 206 can perform a hash of an ID associated with the piece of data, or otherwise identify a key ID for the piece of data. Once the key ID for the data is identified, the key ID mapper 206 can determine the applicable cluster 106 for the data and route the request or the piece of data to the applicable cluster 106).
Claims 4-6, 11-13, and 18-20 are rejected under 35 U.S.C. 103 as being unpatentable over U.S Patent Application Publication 2014/0149794 issued to Shetty et al. (hereinafter as "Shetty") in view of U.S Patent Application Publication 2015/0254325 issued to Russell R. Stringham (hereinafter as “Stringham”) in view of U.S Patent Application Publication 2014/0330785 issued to Isherwood et al. (hereinafter as “Isherwood”) in further view of U.S Patent Application Publication p issued to SKJOLSVOLD (hereinafter as “SKJOLSVOLD”).
Regarding claim 4, the modification of Shetty, Stringham, and Isherwood teaches claimed invention substantially as claimed, however the modification of Shetty, Stringham, and Isherwood does not explicitly teach mapping each logical shard of the plurality of logical shards to one of the plurality of physical shards further comprises: sorting, by the server of the cloud-based storage system, the plurality of logical shards in descending order based on a load of each logical shard; assigning, by the server of the cloud-based storage system, one logical shard beginning from a top of the sorted ordered list to each physical shard of the plurality of physical shards; sorting, by the server of the cloud-based storage system, remaining unassigned logical shards in ascending order based on the load of each remaining unassigned logical shard; assigning, by the server of the cloud-based storage system, one logical shard beginning from the top of the sorted ordered list to each physical shard of the plurality of physical shards; determining, by the server of the cloud-based storage system, whether all logical shards have been assigned to a physical shard; sorting of remaining unassigned logical shards in ascending order based on the load of each remaining unassigned logical shard, and assigning one logical shard beginning from the top of the sorted ordered list to each physical shard of the plurality of physical shards until all logical shards have been assigned to a physical shard.
SKJOLSVOLD teaches mapping each logical shard of the plurality of logical shards to one of the plurality of physical shards further comprises:
sorting, by the server of the cloud-based storage system, the plurality of logical shards in descending order based on a load of each logical shard (SKJOLSVOLD: [0231]; the search orders of at least one of the partitions being assigned and the servers to which the partitions are assigned are determined by ordering those partitions or servers based on corresponding dimensional values, for example in a list. For example, a search order can comprise sorting partitions 712 in ascending or descending order of corresponding partition values by one or multiple dimensions);
assigning, by the server of the cloud-based storage system, one logical shard beginning from a top of the sorted ordered list to each physical shard of the plurality of physical shards (SKJOLSVOLD :[0231]-[0232]; In some implementations, the search orders of at least one of the partitions being assigned and the servers to which the partitions are assigned are determined by ordering those partitions or servers based on corresponding dimensional values, for example in a list. The partitions and/or servers may be sorted by the same dimension, multiple dimensions, or all dimensions. For example, partitions 712 may be sorted in ascending or descending order of partition values that all correspond to dimension 710 a);
sorting, by the server of the cloud-based storage system, remaining unassigned logical shards in ascending order based on the load of each remaining unassigned logical shard (SKJOLSVOLD: [0069]; The propose candidates function can be configured to identify, select, and/or provide a plurality of candidate operations, where the plurality of candidate operations are potential load balancing operations associated with partitions assigned to servers of the distributed system. [0189]; The propose candidates function can select the plurality of candidate operations based on server load of the servers of the scalable storage. For example, the propose candidates function can sort each server of the scalable storage from busiest to idlest as quantified by a load metric, such as the server load metric described above. Thus, servers can be added to the candidate target server set based on server load, for example, based on having low server load);
assigning, by the server of the cloud-based storage system, one logical shard beginning from the top of the sorted ordered list to each physical shard of the plurality of physical shards (SKJOLSVOLD: [0231]-[0232]; In some implementations, the search orders of at least one of the partitions being assigned and the servers to which the partitions are assigned are determined by ordering those partitions or servers based on corresponding dimensional values, for example in a list. The partitions and/or servers may be sorted by the same dimension, multiple dimensions, or all dimensions. For example, partitions 712 may be sorted in ascending or descending order of partition values that all correspond to dimension 710 a
[0241]; Examples of such cases include where at least one partition to be assigned is presently unassigned to a server, or otherwise should be urgently assigned to a server. For example, any of the at least one partitions to be assigned may presently be assigned to or were previously assigned to one or more servers of the scalable storage that have failed, been shut down, or are otherwise unavailable for hosting);
determining, by the server of the cloud-based storage system, whether all logical shards have been assigned to a physical shard (SKJOLSVOLD: [0265]; In some implementations, the determination comprises dividing the servers into sets. One set can correspond to servers that have partitions that are to be offloaded in the assignment plan (also referred to as first server set). Another set can correspond to servers to which those partitions are to be assigned (also referred to as second server set)); and
in response to determining not all logical shards have been assigned to a physical shard, repeating, by the server of the cloud-based storage system, said sorting of the plurality of logical shards in descending order based on a load of each logical shard, assigning one logical shard beginning from a top of the sorted ordered list to each physical shard of the plurality of physical shards (SKJOLSVOLD: [0233]; In some implementations, at least one candidate server is selected in an iteration of the searching, and in another iteration of the searching the at least one candidate server is updated to at least another server. The update may be based on a new candidate server having a greater available capacity or lower utilization in one or more dimensions. For example, servers 714 may be searched for assignment in a binary search or other type of search where each iteration of the search examines a different server or servers of servers 714 for the assignment. The at least one candidate server may be selected in one iteration and replaced by another server in a subsequent iteration. A new candidate server may have more available capacity, less utilization, or otherwise be more suitable for the partition or partitions being assigned. [0247]; The selection can be based on, for example, any combination of the number of partitions that are subject to load balancing, the size of those partitions, whether those partitions are presently assigned to a server or are unassigned, the number of servers available for assignment, the available capacity of those servers, or other factors),
sorting of remaining unassigned logical shards in ascending order based on the load of each remaining unassigned logical shard, and assigning one logical shard beginning from the top of the sorted ordered list to each physical shard of the plurality of physical shards until all logical shards have been assigned to a physical shard (SKJOLSVOLD: [0233]; In some implementations, at least one candidate server is selected in an iteration of the searching, and in another iteration of the searching the at least one candidate server is updated to at least another server. The update may be based on a new candidate server having a greater available capacity or lower utilization in one or more dimensions. For example, servers 714 may be searched for assignment in a binary search or other type of search where each iteration of the search examines a different server or servers of servers 714 for the assignment. The at least one candidate server may be selected in one iteration and replaced by another server in a subsequent iteration. A new candidate server may have more available capacity, less utilization, or otherwise be more suitable for the partition or partitions being assigned. [0247]; The selection can be based on, for example, any combination of the number of partitions that are subject to load balancing, the size of those partitions, whether those partitions are presently assigned to a server or are unassigned, the number of servers available for assignment, the available capacity of those servers, or other factors).
It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the invention, to modify Shetty (teaches a file key for each file of a plurality of files stored in a plurality of physical shards, each physical shard maintained by a node of a plurality of nodes in one or more clusters, wherein the file key comprises a hash of an enterprise identifier (enterprise ID) for an entity to which the creator of the file is a member, a hash of a folder identifier (folder ID) for a location in which the file is stored, and a hash of a file identifier (file ID) uniquely identifying the file) with the teachings of Stringham (teaches logically partitioning, by the server of the cloud-based storage system, the ordered list into a plurality of logical shards to identifying, by the server of the cloud-based storage system, a last key value for each logical shard in the partitioned ordered list and saving, by the server of the cloud-based storage system, the identified last key value for each logical shard in a meta-store for the physical shard to which the logical shard is mapped) with the teachings of ISHERWOOD (teaches partially re-indexing the plurality of shards comprises selecting a set of largest shards in the one or more clusters based on an index size of each shard, redirecting traffic for each shard of the selected set of shards to a different cluster of the one or more clusters while the plurality of shards are being partially reindexed, and re-directing traffic back to each shard of the selected set of shards upon completion of partially reindexing the plurality of shards) to further include the teachings of SKJOLSVOLD (teaches spilling over a source shard of the plurality of physical shards to a target shard of the plurality of target shards…federating, by the server of the cloud-based storage system, the update to the override map of the meta-store for the source shard and the target shard to one or more components of the cloud-based storage system). One of ordinary skill in the art would have been motivated to make such a combination of providing better results in improving the load balancing with respect to the CPU utilization (See SKJOLSVOLD: [0039]). In addition, the references (Shetty, Stringham, ISHERWOOD, and SKJOLSVOLD) teach features that are directed to analogous art and they are directed to the same field of endeavor as Shetty, Stringham, ISHERWOOD, and SKJOLSVOLD are directed to mapping and storing partition information based on replicating data.
Regarding claim 5, the modification of Shetty, Stringham, and Isherwood teaches claimed invention substantially as claimed, however the modification of Shetty, Stringham, and Isherwood does not explicitly teach further comprising spilling over a source shard of the plurality of physical shards to a target shard of the plurality of target shards, wherein spilling over the source shard to the target shard comprises: identifying, by the server of the cloud-based storage system, the source shard based on available storage capacity of each physical shard of the plurality of physical shards; identifying, by the server of the cloud-based storage system, the target shard based on the available storage capacity of each physical shard of the plurality of physical shards; spilling, by the server of the cloud-based storage system, content from the source shard to the target shard; updating, by the server of the cloud-based storage system, an override map of the meta- store for the source shard and the target shard; and federating, by the server of the cloud-based storage system, the update to the override map of the meta-store for the source shard and the target shard to one or more components of the cloud- based storage system.
However, SKJOLSVOLD teaches further comprising spilling over a source shard of the plurality of physical shards to a target shard of the plurality of target shards, wherein spilling over the source shard to the target shard comprises: identifying, by the server of the cloud-based storage system, the source shard based on available storage capacity of each physical shard of the plurality of physical shards (SKJOLSVOLD: [0220]; Utilizing any to all of the assignment heuristics, assignment generator 704 may determine an assignment of a partition to a server based on the server having available capacity to host the partition with respect to one to all of the dimensions);
identifying, by the server of the cloud-based storage system, the target shard based on the available storage capacity of each physical shard of the plurality of physical shards (SKJOLSVOLD: [0224]; In bin packing, any given server bin can have a size or volume that corresponds to capacity (e.g. a cap value) and an occupied volume that corresponds to present utilization (e.g. a server value or a sum of partition values of the server). [0238]-[0239]; The dimensional values can be selected from partition values and server values of the dimensions. For example, the analysis may be based on dimensional values that would correspond to the servers if the assignment plans were to be executed. The selection can also be based on whether or not a server in the assignment plan is exceeding a cap value or how many servers or cap values are being exceeded in the assignment plan);
spilling, by the server of the cloud-based storage system, content from the source shard to the target shard (SKJOLSVOLD: [0154]; Furthermore, the one or more servers can be identified based on server load, such that the plurality of candidate operations is selected based on server load of the one or more servers. For example, the propose candidates function can sort each server of the scalable storage from busiest to idlest as quantified by a load metric, such as a server load metric);
updating, by the server of the cloud-based storage system, an override map of the meta- store for the source shard and the target shard (SKJOLSVOLD: [0040]; It is noted that movement of a partition does not require physical movement of data at the storage level. Rather, in various implementations, a partition can be moved to another server by reassigning or remapping a partition range assignment of the partition to a new server (e.g. in a partition map)); and
federating, by the server of the cloud-based storage system, the update to the override map of the meta-store for the source shard and the target shard to one or more components of the cloud- based storage system (SKJOLSVOLD: [0040]; It is noted that movement of a partition does not require physical movement of data at the storage level. Rather, in various implementations, a partition can be moved to another server by reassigning or remapping a partition range assignment of the partition to a new server (e.g. in a partition map)).
It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the invention, to modify Shetty (teaches a file key for each file of a plurality of files stored in a plurality of physical shards, each physical shard maintained by a node of a plurality of nodes in one or more clusters, wherein the file key comprises a hash of an enterprise identifier (enterprise ID) for an entity to which the creator of the file is a member, a hash of a folder identifier (folder ID) for a location in which the file is stored, and a hash of a file identifier (file ID) uniquely identifying the file) with the teachings of Stringham (teaches logically partitioning, by the server of the cloud-based storage system, the ordered list into a plurality of logical shards to identifying, by the server of the cloud-based storage system, a last key value for each logical shard in the partitioned ordered list and saving, by the server of the cloud-based storage system, the identified last key value for each logical shard in a meta-store for the physical shard to which the logical shard is mapped) with the teachings of ISHERWOOD (teaches partially re-indexing the plurality of shards comprises selecting a set of largest shards in the one or more clusters based on an index size of each shard, redirecting traffic for each shard of the selected set of shards to a different cluster of the one or more clusters while the plurality of shards are being partially reindexed, and re-directing traffic back to each shard of the selected set of shards upon completion of partially reindexing the plurality of shards) to further include the teachings of SKJOLSVOLD (teaches spilling over a source shard of the plurality of physical shards to a target shard of the plurality of target shards…federating, by the server of the cloud-based storage system, the update to the override map of the meta-store for the source shard and the target shard to one or more components of the cloud-based storage system). One of ordinary skill in the art would have been motivated to make such a combination of providing better results in improving the load balancing with respect to the CPU utilization (See SKJOLSVOLD: [0039]). In addition, the references (Shetty, Stringham, ISHERWOOD, and SKJOLSVOLD) teach features that are directed to analogous art and they are directed to the same field of endeavor as Shetty, Stringham, ISHERWOOD, and SKJOLSVOLD are directed to mapping and storing partition information based on replicating data.
Regarding claim 6, the modification of Shetty, Stringham, and Isherwood teaches claimed invention substantially as claimed, however the modification of Shetty, Stringham, and Isherwood does not explicitly teach further comprising spilling over a source shard of the plurality of physical shards to a target shard of the plurality of target shards, wherein spilling over the source shard to the target shard comprises: identifying, by the server of the cloud-based storage system, available space for each node of the plurality of node in a cluster of the one or more clusters; selecting, by the server of the cloud-based storage system, a plurality of target shards based on an overhead capacity of each shard of the plurality of shards on a node of the plurality of node having a most available space; identifying, by the server of the cloud-based storage system, one or more source shards based on one or more overhead criteria; selecting, by the server of the cloud-based storage system, one of the identified one or more source shards; spilling, by the server of the cloud-based storage system, the selected one of the identified one or more source shards to the selected plurality of target shards; updating, by the server of the cloud-based storage system, an override map of the meta- store for the source shard and each target shard; federating, by the server of the cloud-based storage system, the update to the override map of the meta-store for the source shard and each target shard to one or more components of the cloud-based storage system; determining, by the server of the cloud-based storage system, whether all of the identified one or more source shards have been spilled to the plurality of target shards; and in response to determining not all of the identified one or more source shards have been spilled to the plurality of target shards, repeating, by the server of the cloud-based storage system, the selecting one of the identified one or more source shards, spilling the selected one of the identified one or more source shards to the selected plurality of target shards, updating the override map of the meta-store for the source shard and each target shard, and federating the update to the override map of the meta-store for the source shard and each target shard to one or more components of the cloud-based storage system until all of the identified one or more source shards have been spilled to the plurality of target shards.
However, SKJOLSVOLD teaches further comprising spilling over a source shard of the plurality of physical shards to a target shard of the plurality of target shards, wherein spilling over the source shard to the target shard comprises: identifying, by the server of the cloud-based storage system, available space for each node of the plurality of node in a cluster of the one or more clusters (SKJOLSVOLD: [0203]; Assignment coordinator 702 can be configured to provide one or more servers 714 to which partitions 712 may be assigned. Servers 714 may comprise all servers of the scalable storage or a subset therefrom. Furthermore servers 714 may comprise all servers having available capacity in any dimension or dimensions or a given dimension or dimensions, or a subset therefrom);
selecting, by the server of the cloud-based storage system, a plurality of target shards based on an overhead capacity of each shard of the plurality of shards on a node of the plurality of node having a most available space (SKJOLSVOLD: [0189]-[0190]; The propose candidates function can select the plurality of candidate operations based on server load of the servers of the scalable storage. For example, the propose candidates function can sort each server of the scalable storage from busiest to idlest as quantified by a load metric, such as the server load metric described above. The candidate target server set may be limited to a number of those servers that have the lowest server load. Thus, servers can be added to the candidate target server set based on server load, for example, based on having low server load. Thus, a load balancing operation could comprise a given partition from the candidate partitions set being moved to a given server of the candidate target server set, for each combination of partition and server in the sets);
identifying, by the server of the cloud-based storage system, one or more source shards based on one or more overhead criteria (SKJOLSVOLD: [0200]; Assignment coordinator 702 can be configured to manage the generation of assignment plans for partitions to servers. In this respect, assignment coordinator 702 can be configured to collect and provide one or more partitions 712 to assignment generator 704 for assignment and one or more servers 714 (e.g. servers 204, 206, 208, and 210 in FIG. 2) of scalable storage (e.g. scalable storage 200 in FIG. 2) to assignment generator 704 that may receive those assignments);
selecting, by the server of the cloud-based storage system, one of the identified one or more source shards (SKJOLSVOLD: [0152]; In some implementations, selecting the plurality of candidate operations comprises identifying one or more servers that have a server metric of a dimension, or a particular dimension that exceeds a threshold value, such as the server cap. [0154]; Furthermore, the one or more servers can be identified based on server load, such that the plurality of candidate operations is selected based on server load of the one or more servers. [0156]; Furthermore, the propose candidates function may sort the servers of the scalable storage from highest to lowest server metric of the dimension. A candidate target server set may be selected as a number of those servers that have the lowest server metric of the dimension);
spilling, by the server of the cloud-based storage system, the selected one of the identified one or more source shards to the selected plurality of target shards (SKJOLSVOLD: [0154]; Furthermore, the one or more servers can be identified based on server load, such that the plurality of candidate operations is selected based on server load of the one or more servers. For example, the propose candidates function can sort each server of the scalable storage from busiest to idlest as quantified by a load metric, such as a server load metric);
updating, by the server of the cloud-based storage system, an override map of the meta- store for the source shard and each target shard (SKJOLSVOLD: [0040]; It is noted that movement of a partition does not require physical movement of data at the storage level. Rather, in various implementations, a partition can be moved to another server by reassigning or remapping a partition range assignment of the partition to a new server (e.g. in a partition map));
federating, by the server of the cloud-based storage system, the update to the override map of the meta-store for the source shard and each target shard to one or more components of the cloud-based storage system (SKJOLSVOLD: [0040]; It is noted that movement of a partition does not require physical movement of data at the storage level. Rather, in various implementations, a partition can be moved to another server by reassigning or remapping a partition range assignment of the partition to a new server (e.g. in a partition map));
determining, by the server of the cloud-based storage system, whether all of the identified one or more source shards have been spilled to the plurality of target shards (SKJOLSVOLD: [0040]; It is noted that movement of a partition does not require physical movement of data at the storage level. Rather, in various implementations, a partition can be moved to another server by reassigning or remapping a partition range assignment of the partition to a new server (e.g. in a partition map)); and
in response to determining not all of the identified one or more source shards have been spilled to the plurality of target shards, repeating, by the server of the cloud-based storage system, the selecting one of the identified one or more source shards, spilling the selected one of the identified one or more source shards to the selected plurality of target shards, updating the override map of the meta-store for the source shard and each target shard, and federating the update to the override map of the meta-store for the source shard and each target shard to one or more components of the cloud-based storage system until all of the identified one or more source shards have been spilled to the plurality of target shards (SKJOLSVOLD: [0233]; In some implementations, at least one candidate server is selected in an iteration of the searching, and in another iteration of the searching the at least one candidate server is updated to at least another server. The update may be based on a new candidate server having a greater available capacity or lower utilization in one or more dimensions. For example, servers 714 may be searched for assignment in a binary search or other type of search where each iteration of the search examines a different server or servers of servers 714 for the assignment. The at least one candidate server may be selected in one iteration and replaced by another server in a subsequent iteration. A new candidate server may have more available capacity, less utilization, or otherwise be more suitable for the partition or partitions being assigned. [0247]; The selection can be based on, for example, any combination of the number of partitions that are subject to load balancing, the size of those partitions, whether those partitions are presently assigned to a server or are unassigned, the number of servers available for assignment, the available capacity of those servers, or other factors).
It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the invention, to modify Shetty (teaches a file key for each file of a plurality of files stored in a plurality of physical shards, each physical shard maintained by a node of a plurality of nodes in one or more clusters, wherein the file key comprises a hash of an enterprise identifier (enterprise ID) for an entity to which the creator of the file is a member, a hash of a folder identifier (folder ID) for a location in which the file is stored, and a hash of a file identifier (file ID) uniquely identifying the file) with the teachings of Stringham (teaches logically partitioning, by the server of the cloud-based storage system, the ordered list into a plurality of logical shards to identifying, by the server of the cloud-based storage system, a last key value for each logical shard in the partitioned ordered list and saving, by the server of the cloud-based storage system, the identified last key value for each logical shard in a meta-store for the physical shard to which the logical shard is mapped) with the teachings of ISHERWOOD (teaches partially re-indexing the plurality of shards comprises selecting a set of largest shards in the one or more clusters based on an index size of each shard, redirecting traffic for each shard of the selected set of shards to a different cluster of the one or more clusters while the plurality of shards are being partially reindexed, and re-directing traffic back to each shard of the selected set of shards upon completion of partially reindexing the plurality of shards) to further include the teachings of SKJOLSVOLD (teaches spilling over a source shard of the plurality of physical shards to a target shard of the plurality of target shards…federating, by the server of the cloud-based storage system, the update to the override map of the meta-store for the source shard and the target shard to one or more components of the cloud-based storage system). One of ordinary skill in the art would have been motivated to make such a combination of providing better results in improving the load balancing with respect to the CPU utilization (See SKJOLSVOLD: [0039]). In addition, the references (Shetty, Stringham, ISHERWOOD, and SKJOLSVOLD) teach features that are directed to analogous art and they are directed to the same field of endeavor as Shetty, Stringham, ISHERWOOD, and SKJOLSVOLD are directed to mapping and storing partition information based on replicating data.
Regarding claim 11, the modification of Shetty, Stringham, and Isherwood teaches claimed invention substantially as claimed, however the modification of Shetty, Stringham, and Isherwood does not explicitly teach mapping each logical shard of the plurality of logical shards to one of the plurality of physical shards further comprises: sorting the plurality of logical shards in descending order based on a load of each logical shard; assigning one logical shard beginning from a top of the sorted ordered list to each physical shard of the plurality of physical shards; sorting remaining unassigned logical shards in ascending order based on the load of each remaining unassigned logical shard; assigning one logical shard beginning from the top of the sorted ordered list to each physical shard of the plurality of physical shards; determining whether all logical shards have been assigned to a physical shard; and in response to determining not all logical shards have been assigned to a physical shard, repeating said sorting of the plurality of logical shards in descending order based on a load of each logical shard, assigning one logical shard beginning from a top of the sorted ordered list to each physical shard of the plurality of physical shards, sorting of remaining unassigned logical shards in ascending order based on the load of each remaining unassigned logical shard, and assigning one logical shard beginning from the top of the sorted ordered list to each physical shard of the plurality of physical shards until all logical shards have been assigned to a physical shard.
However, SKJOLSVOLD teaches mapping each logical shard of the plurality of logical shards to one of the plurality of physical shards further comprises: sorting the plurality of logical shards in descending order based on a load of each logical shard (SKJOLSVOLD: [0231]; the search orders of at least one of the partitions being assigned and the servers to which the partitions are assigned are determined by ordering those partitions or servers based on corresponding dimensional values, for example in a list. For example, a search order can comprise sorting partitions 712 in ascending or descending order of corresponding partition values by one or multiple dimensions);
assigning one logical shard beginning from a top of the sorted ordered list to each physical shard of the plurality of physical shards (SKJOLSVOLD :[0231]-[0232]; In some implementations, the search orders of at least one of the partitions being assigned and the servers to which the partitions are assigned are determined by ordering those partitions or servers based on corresponding dimensional values, for example in a list. The partitions and/or servers may be sorted by the same dimension, multiple dimensions, or all dimensions. For example, partitions 712 may be sorted in ascending or descending order of partition values that all correspond to dimension 710 a);
sorting remaining unassigned logical shards in ascending order based on the load of each remaining unassigned logical shard (SKJOLSVOLD: [0069]; The propose candidates function can be configured to identify, select, and/or provide a plurality of candidate operations, where the plurality of candidate operations are potential load balancing operations associated with partitions assigned to servers of the distributed system. [0189]; The propose candidates function can select the plurality of candidate operations based on server load of the servers of the scalable storage. For example, the propose candidates function can sort each server of the scalable storage from busiest to idlest as quantified by a load metric, such as the server load metric described above. Thus, servers can be added to the candidate target server set based on server load, for example, based on having low server load);
assigning one logical shard beginning from the top of the sorted ordered list to each physical shard of the plurality of physical shards (SKJOLSVOLD: [0233]; In some implementations, at least one candidate server is selected in an iteration of the searching, and in another iteration of the searching the at least one candidate server is updated to at least another server. The update may be based on a new candidate server having a greater available capacity or lower utilization in one or more dimensions. For example, servers 714 may be searched for assignment in a binary search or other type of search where each iteration of the search examines a different server or servers of servers 714 for the assignment. The at least one candidate server may be selected in one iteration and replaced by another server in a subsequent iteration. A new candidate server may have more available capacity, less utilization, or otherwise be more suitable for the partition or partitions being assigned. [0247]; The selection can be based on, for example, any combination of the number of partitions that are subject to load balancing, the size of those partitions, whether those partitions are presently assigned to a server or are unassigned, the number of servers available for assignment, the available capacity of those servers, or other factors);
determining whether all logical shards have been assigned to a physical shard (SKJOLSVOLD: [0265]; In some implementations, the determination comprises dividing the servers into sets. One set can correspond to servers that have partitions that are to be offloaded in the assignment plan (also referred to as first server set). Another set can correspond to servers to which those partitions are to be assigned (also referred to as second server set)); and
in response to determining not all logical shards have been assigned to a physical shard, repeating said sorting of the plurality of logical shards in descending order based on a load of each logical shard, assigning one logical shard beginning from a top of the sorted ordered list to each physical shard of the plurality of physical shards (SKJOLSVOLD: [0233]; In some implementations, at least one candidate server is selected in an iteration of the searching, and in another iteration of the searching the at least one candidate server is updated to at least another server. The update may be based on a new candidate server having a greater available capacity or lower utilization in one or more dimensions. For example, servers 714 may be searched for assignment in a binary search or other type of search where each iteration of the search examines a different server or servers of servers 714 for the assignment. The at least one candidate server may be selected in one iteration and replaced by another server in a subsequent iteration. A new candidate server may have more available capacity, less utilization, or otherwise be more suitable for the partition or partitions being assigned. [0247]; The selection can be based on, for example, any combination of the number of partitions that are subject to load balancing, the size of those partitions, whether those partitions are presently assigned to a server or are unassigned, the number of servers available for assignment, the available capacity of those servers, or other factors),
sorting of remaining unassigned logical shards in ascending order based on the load of each remaining unassigned logical shard, and assigning one logical shard beginning from the top of the sorted ordered list to each physical shard of the plurality of physical shards until all logical shards have been assigned to a physical shard (SKJOLSVOLD: [0233]; In some implementations, at least one candidate server is selected in an iteration of the searching, and in another iteration of the searching the at least one candidate server is updated to at least another server. The update may be based on a new candidate server having a greater available capacity or lower utilization in one or more dimensions. For example, servers 714 may be searched for assignment in a binary search or other type of search where each iteration of the search examines a different server or servers of servers 714 for the assignment. The at least one candidate server may be selected in one iteration and replaced by another server in a subsequent iteration. A new candidate server may have more available capacity, less utilization, or otherwise be more suitable for the partition or partitions being assigned. [0247]; The selection can be based on, for example, any combination of the number of partitions that are subject to load balancing, the size of those partitions, whether those partitions are presently assigned to a server or are unassigned, the number of servers available for assignment, the available capacity of those servers, or other factors).
It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the invention, to modify Shetty (teaches a file key for each file of a plurality of files stored in a plurality of physical shards, each physical shard maintained by a node of a plurality of nodes in one or more clusters, wherein the file key comprises a hash of an enterprise identifier (enterprise ID) for an entity to which the creator of the file is a member, a hash of a folder identifier (folder ID) for a location in which the file is stored, and a hash of a file identifier (file ID) uniquely identifying the file) with the teachings of Stringham (teaches logically partitioning, by the server of the cloud-based storage system, the ordered list into a plurality of logical shards to identifying, by the server of the cloud-based storage system, a last key value for each logical shard in the partitioned ordered list and saving, by the server of the cloud-based storage system, the identified last key value for each logical shard in a meta-store for the physical shard to which the logical shard is mapped) with the teachings of ISHERWOOD (teaches partially re-indexing the plurality of shards comprises selecting a set of largest shards in the one or more clusters based on an index size of each shard, redirecting traffic for each shard of the selected set of shards to a different cluster of the one or more clusters while the plurality of shards are being partially reindexed, and re-directing traffic back to each shard of the selected set of shards upon completion of partially reindexing the plurality of shards) to further include the teachings of SKJOLSVOLD (teaches spilling over a source shard of the plurality of physical shards to a target shard of the plurality of target shards…federating, by the server of the cloud-based storage system, the update to the override map of the meta-store for the source shard and the target shard to one or more components of the cloud-based storage system). One of ordinary skill in the art would have been motivated to make such a combination of providing better results in improving the load balancing with respect to the CPU utilization (See SKJOLSVOLD: [0039]). In addition, the references (Shetty, Stringham, ISHERWOOD, and SKJOLSVOLD) teach features that are directed to analogous art and they are directed to the same field of endeavor as Shetty, Stringham, ISHERWOOD, and SKJOLSVOLD are directed to mapping and storing partition information based on replicating data.
Regarding claim 12, the modification of Shetty, Stringham, and Isherwood teaches claimed invention substantially as claimed, however the modification of Shetty, Stringham, and Isherwood does not explicitly teach the instructions further cause the processor to spill over a source shard of the plurality of physical shards to a target shard of the plurality of target shards by: identifying the source shard based on available storage capacity of each physical shard of the plurality of physical shards; identifying the target shard based on the available storage capacity of each physical shard of the plurality of physical shards; spilling content from the source shard to the target shard; updating an override map of the meta-store for the source shard and the target; and federating the update to the override map of the meta-store for the source shard and the target shard to one or more components of the cloud-based storage.
However, SKJOLSVOLD teaches the instructions further cause the processor to spill over a source shard of the plurality of physical shards to a target shard of the plurality of target shards by: identifying the source shard based on available storage capacity of each physical shard of the plurality of physical shards (SKJOLSVOLD: [0224]; In bin packing, any given server bin can have a size or volume that corresponds to capacity (e.g. a cap value) and an occupied volume that corresponds to present utilization (e.g. a server value or a sum of partition values of the server). [0238]-[0239]; The dimensional values can be selected from partition values and server values of the dimensions. For example, the analysis may be based on dimensional values that would correspond to the servers if the assignment plans were to be executed. The selection can also be based on whether or not a server in the assignment plan is exceeding a cap value or how many servers or cap values are being exceeded in the assignment plan);
identifying the target shard based on the available storage capacity of each physical shard of the plurality of physical shards (SKJOLSVOLD: [0224]; In bin packing, any given server bin can have a size or volume that corresponds to capacity (e.g. a cap value) and an occupied volume that corresponds to present utilization (e.g. a server value or a sum of partition values of the server). [0238]-[0239]; The dimensional values can be selected from partition values and server values of the dimensions. For example, the analysis may be based on dimensional values that would correspond to the servers if the assignment plans were to be executed. The selection can also be based on whether or not a server in the assignment plan is exceeding a cap value or how many servers or cap values are being exceeded in the assignment plan);
spilling content from the source shard to the target shard (SKJOLSVOLD: [0154]; Furthermore, the one or more servers can be identified based on server load, such that the plurality of candidate operations is selected based on server load of the one or more servers. For example, the propose candidates function can sort each server of the scalable storage from busiest to idlest as quantified by a load metric, such as a server load metric);
updating an override map of the meta-store for the source shard and the target shard (SKJOLSVOLD: [0040]; It is noted that movement of a partition does not require physical movement of data at the storage level. Rather, in various implementations, a partition can be moved to another server by reassigning or remapping a partition range assignment of the partition to a new server (e.g. in a partition map)); and
federating the update to the override map of the meta-store for the source shard and the target shard to one or more components of the cloud-based storage system (SKJOLSVOLD: [0040]; It is noted that movement of a partition does not require physical movement of data at the storage level. Rather, in various implementations, a partition can be moved to another server by reassigning or remapping a partition range assignment of the partition to a new server (e.g. in a partition map)).
It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the invention, to modify Shetty (teaches a file key for each file of a plurality of files stored in a plurality of physical shards, each physical shard maintained by a node of a plurality of nodes in one or more clusters, wherein the file key comprises a hash of an enterprise identifier (enterprise ID) for an entity to which the creator of the file is a member, a hash of a folder identifier (folder ID) for a location in which the file is stored, and a hash of a file identifier (file ID) uniquely identifying the file) with the teachings of Stringham (teaches logically partitioning, by the server of the cloud-based storage system, the ordered list into a plurality of logical shards to identifying, by the server of the cloud-based storage system, a last key value for each logical shard in the partitioned ordered list and saving, by the server of the cloud-based storage system, the identified last key value for each logical shard in a meta-store for the physical shard to which the logical shard is mapped) with the teachings of ISHERWOOD (teaches partially re-indexing the plurality of shards comprises selecting a set of largest shards in the one or more clusters based on an index size of each shard, redirecting traffic for each shard of the selected set of shards to a different cluster of the one or more clusters while the plurality of shards are being partially reindexed, and re-directing traffic back to each shard of the selected set of shards upon completion of partially reindexing the plurality of shards) to further include the teachings of SKJOLSVOLD (teaches spilling over a source shard of the plurality of physical shards to a target shard of the plurality of target shards…federating, by the server of the cloud-based storage system, the update to the override map of the meta-store for the source shard and the target shard to one or more components of the cloud-based storage system). One of ordinary skill in the art would have been motivated to make such a combination of providing better results in improving the load balancing with respect to the CPU utilization (See SKJOLSVOLD: [0039]). In addition, the references (Shetty, Stringham, ISHERWOOD, and SKJOLSVOLD) teach features that are directed to analogous art and they are directed to the same field of endeavor as Shetty, Stringham, ISHERWOOD, and SKJOLSVOLD are directed to mapping and storing partition information based on replicating data.
Regarding claim 13, the modification of Shetty, Stringham, and Isherwood teaches claimed invention substantially as claimed, however the modification of Shetty, Stringham, and Isherwood does not explicitly teach the instructions further cause the processor to spill over a source shard of the plurality of physical shards to a target shard of the plurality of target shards by: identifying available space for each node of the plurality of node in a cluster of the one or more clusters; selecting a plurality of target shards based on an overhead capacity of each shard of the plurality of shards on a node of the plurality of node having a most available space; identifying one or more source shards based on one or more overhead criteria; selecting one of the identified one or more source shards; spilling the selected one of the identified one or more source shards to the selected plurality of target shards; updating an override map of the meta-store for the source shard and each target shard; federating the update to the override map of the meta-store for the source shard and each target shard to one or more components of the cloud-based storage system; determining whether all of the identified one or more source shards have been spilled to the plurality of target shards; and in response to determining not all of the identified one or more source shards have been spilled to the plurality of target shards, repeating the selecting one of the identified one or more source shards, spilling the selected one of the identified one or more source shards to the selected plurality of target shards, updating the override map of the meta-store for the source shard and each target shard, and federating the update to the override map of the meta-store for the source shard and each target shard to one or more components of the cloud-based storage system until all of the identified one or more source shards have been spilled to the plurality of target shards.
However, SKJOLSVOLD teaches the instructions further cause the processor to spill over a source shard of the plurality of physical shards to a target shard of the plurality of target shards by: identifying available space for each node of the plurality of node in a cluster of the one or more clusters (SKJOLSVOLD: [0203]; Assignment coordinator 702 can be configured to provide one or more servers 714 to which partitions 712 may be assigned. Servers 714 may comprise all servers of the scalable storage or a subset therefrom. Furthermore servers 714 may comprise all servers having available capacity in any dimension or dimensions or a given dimension or dimensions, or a subset therefrom);
selecting a plurality of target shards based on an overhead capacity of each shard of the plurality of shards on a node of the plurality of node having a most available space (SKJOLSVOLD: [0189]-[0190]; The propose candidates function can select the plurality of candidate operations based on server load of the servers of the scalable storage. For example, the propose candidates function can sort each server of the scalable storage from busiest to idlest as quantified by a load metric, such as the server load metric described above. The candidate target server set may be limited to a number of those servers that have the lowest server load. Thus, servers can be added to the candidate target server set based on server load, for example, based on having low server load. Thus, a load balancing operation could comprise a given partition from the candidate partitions set being moved to a given server of the candidate target server set, for each combination of partition and server in the sets);
identifying one or more source shards based on one or more overhead criteria (SKJOLSVOLD: [0200]; Assignment coordinator 702 can be configured to manage the generation of assignment plans for partitions to servers. In this respect, assignment coordinator 702 can be configured to collect and provide one or more partitions 712 to assignment generator 704 for assignment and one or more servers 714 (e.g. servers 204, 206, 208, and 210 in FIG. 2) of scalable storage (e.g. scalable storage 200 in FIG. 2) to assignment generator 704 that may receive those assignments);
selecting one of the identified one or more source shards (SKJOLSVOLD: [0152]; In some implementations, selecting the plurality of candidate operations comprises identifying one or more servers that have a server metric of a dimension, or a particular dimension that exceeds a threshold value, such as the server cap. [0154]; Furthermore, the one or more servers can be identified based on server load, such that the plurality of candidate operations is selected based on server load of the one or more servers. [0156]; Furthermore, the propose candidates function may sort the servers of the scalable storage from highest to lowest server metric of the dimension. A candidate target server set may be selected as a number of those servers that have the lowest server metric of the dimension);
spilling the selected one of the identified one or more source shards to the selected plurality of target shards (SKJOLSVOLD: [0154]; Furthermore, the one or more servers can be identified based on server load, such that the plurality of candidate operations is selected based on server load of the one or more servers. For example, the propose candidates function can sort each server of the scalable storage from busiest to idlest as quantified by a load metric, such as a server load metric);
updating an override map of the meta-store for the source shard and each target shard (SKJOLSVOLD: [0040]; It is noted that movement of a partition does not require physical movement of data at the storage level. Rather, in various implementations, a partition can be moved to another server by reassigning or remapping a partition range assignment of the partition to a new server (e.g. in a partition map));
federating the update to the override map of the meta-store for the source shard and each target shard to one or more components of the cloud-based storage system (SKJOLSVOLD: [0040]; It is noted that movement of a partition does not require physical movement of data at the storage level. Rather, in various implementations, a partition can be moved to another server by reassigning or remapping a partition range assignment of the partition to a new server (e.g. in a partition map));
determining whether all of the identified one or more source shards have been spilled to the plurality of target shards (SKJOLSVOLD: [0040]; It is noted that movement of a partition does not require physical movement of data at the storage level. Rather, in various implementations, a partition can be moved to another server by reassigning or remapping a partition range assignment of the partition to a new server (e.g. in a partition map)); and
in response to determining not all of the identified one or more source shards have been spilled to the plurality of target shards, repeating the selecting one of the identified one or more source shards, spilling the selected one of the identified one or more source shards to the selected plurality of target shards, updating the override map of the meta-store for the source shard and each target shard, and federating the update to the override map of the meta-store for the source shard and each target shard to one or more components of the cloud-based storage system until all of the identified one or more source shards have been spilled to the plurality of target shards (SKJOLSVOLD: [0233]; In some implementations, at least one candidate server is selected in an iteration of the searching, and in another iteration of the searching the at least one candidate server is updated to at least another server. The update may be based on a new candidate server having a greater available capacity or lower utilization in one or more dimensions. For example, servers 714 may be searched for assignment in a binary search or other type of search where each iteration of the search examines a different server or servers of servers 714 for the assignment. The at least one candidate server may be selected in one iteration and replaced by another server in a subsequent iteration. A new candidate server may have more available capacity, less utilization, or otherwise be more suitable for the partition or partitions being assigned. [0247]; The selection can be based on, for example, any combination of the number of partitions that are subject to load balancing, the size of those partitions, whether those partitions are presently assigned to a server or are unassigned, the number of servers available for assignment, the available capacity of those servers, or other factors).
It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the invention, to modify Shetty (teaches a file key for each file of a plurality of files stored in a plurality of physical shards, each physical shard maintained by a node of a plurality of nodes in one or more clusters, wherein the file key comprises a hash of an enterprise identifier (enterprise ID) for an entity to which the creator of the file is a member, a hash of a folder identifier (folder ID) for a location in which the file is stored, and a hash of a file identifier (file ID) uniquely identifying the file) with the teachings of Stringham (teaches logically partitioning, by the server of the cloud-based storage system, the ordered list into a plurality of logical shards to identifying, by the server of the cloud-based storage system, a last key value for each logical shard in the partitioned ordered list and saving, by the server of the cloud-based storage system, the identified last key value for each logical shard in a meta-store for the physical shard to which the logical shard is mapped) with the teachings of ISHERWOOD (teaches partially re-indexing the plurality of shards comprises selecting a set of largest shards in the one or more clusters based on an index size of each shard, redirecting traffic for each shard of the selected set of shards to a different cluster of the one or more clusters while the plurality of shards are being partially reindexed, and re-directing traffic back to each shard of the selected set of shards upon completion of partially reindexing the plurality of shards) to further include the teachings of SKJOLSVOLD (teaches spilling over a source shard of the plurality of physical shards to a target shard of the plurality of target shards…federating, by the server of the cloud-based storage system, the update to the override map of the meta-store for the source shard and the target shard to one or more components of the cloud-based storage system). One of ordinary skill in the art would have been motivated to make such a combination of providing better results in improving the load balancing with respect to the CPU utilization (See SKJOLSVOLD: [0039]). In addition, the references (Shetty, Stringham, ISHERWOOD, and SKJOLSVOLD) teach features that are directed to analogous art and they are directed to the same field of endeavor as Shetty, Stringham, ISHERWOOD, and SKJOLSVOLD are directed to mapping and storing partition information based on replicating data.
Regarding claim 18, the modification of Shetty, Stringham, and Isherwood teaches claimed invention substantially as claimed, however the modification of Shetty, Stringham, and Isherwood does not explicitly teach mapping each logical shard of the plurality of logical shards to one of the plurality of physical shards further comprises: sorting the plurality of logical shards in descending order based on a load of each logical shard; assigning one logical shard beginning from a top of the sorted ordered list to each physical shard of the plurality of physical shards; sorting remaining unassigned logical shards in ascending order based on the load of each remaining unassigned logical shard; assigning one logical shard beginning from the top of the sorted ordered list to each physical shard of the plurality of physical shards; sorting remaining unassigned logical shards in ascending order based on the load of each remaining unassigned logical shard; assigning one logical shard beginning from the top of the sorted ordered list to each physical shard of the plurality of physical shards; determining whether all logical shards have been assigned to a physical shard; and in response to determining not all logical shards have been assigned to a physical shard, repeating said sorting of the plurality of logical shards in descending order based on a load of each logical shard, assigning one logical shard beginning from a top of the sorted ordered list to each physical shard of the plurality of physical shards, sorting of remaining unassigned logical shards in ascending order based on the load of each remaining unassigned logical shard, and assigning one logical shard beginning from the top of the sorted ordered list to each physical shard of the plurality of physical shards until all logical shards have been assigned to a physical shard.
SKJOLSVOLD teaches mapping each logical shard of the plurality of logical shards to one of the plurality of physical shards further comprises: sorting the plurality of logical shards in descending order based on a load of each logical shard (SKJOLSVOLD: [0231]; the search orders of at least one of the partitions being assigned and the servers to which the partitions are assigned are determined by ordering those partitions or servers based on corresponding dimensional values, for example in a list. For example, a search order can comprise sorting partitions 712 in ascending or descending order of corresponding partition values by one or multiple dimensions);
assigning one logical shard beginning from a top of the sorted ordered list to each physical shard of the plurality of physical shards (SKJOLSVOLD :[0231]-[0232]; In some implementations, the search orders of at least one of the partitions being assigned and the servers to which the partitions are assigned are determined by ordering those partitions or servers based on corresponding dimensional values, for example in a list. The partitions and/or servers may be sorted by the same dimension, multiple dimensions, or all dimensions. For example, partitions 712 may be sorted in ascending or descending order of partition values that all correspond to dimension 710 a);
sorting remaining unassigned logical shards in ascending order based on the load of each remaining unassigned logical shard (SKJOLSVOLD: [0069]; The propose candidates function can be configured to identify, select, and/or provide a plurality of candidate operations, where the plurality of candidate operations are potential load balancing operations associated with partitions assigned to servers of the distributed system. [0189]; The propose candidates function can select the plurality of candidate operations based on server load of the servers of the scalable storage. For example, the propose candidates function can sort each server of the scalable storage from busiest to idlest as quantified by a load metric, such as the server load metric described above. Thus, servers can be added to the candidate target server set based on server load, for example, based on having low server load);
assigning one logical shard beginning from the top of the sorted ordered list to each physical shard of the plurality of physical shards (SKJOLSVOLD: [0233]; In some implementations, at least one candidate server is selected in an iteration of the searching, and in another iteration of the searching the at least one candidate server is updated to at least another server. The update may be based on a new candidate server having a greater available capacity or lower utilization in one or more dimensions. For example, servers 714 may be searched for assignment in a binary search or other type of search where each iteration of the search examines a different server or servers of servers 714 for the assignment. The at least one candidate server may be selected in one iteration and replaced by another server in a subsequent iteration. A new candidate server may have more available capacity, less utilization, or otherwise be more suitable for the partition or partitions being assigned. [0247]; The selection can be based on, for example, any combination of the number of partitions that are subject to load balancing, the size of those partitions, whether those partitions are presently assigned to a server or are unassigned, the number of servers available for assignment, the available capacity of those servers, or other factors);
determining whether all logical shards have been assigned to a physical shard (SKJOLSVOLD: [0265]; In some implementations, the determination comprises dividing the servers into sets. One set can correspond to servers that have partitions that are to be offloaded in the assignment plan (also referred to as first server set). Another set can correspond to servers to which those partitions are to be assigned (also referred to as second server set)); and
in response to determining not all logical shards have been assigned to a physical shard, repeating said sorting of the plurality of logical shards in descending order based on a load of each logical shard, assigning one logical shard beginning from a top of the sorted ordered list to each physical shard of the plurality of physical shards, sorting of remaining unassigned logical shards in ascending order based on the load of each remaining unassigned logical shard, and assigning one logical shard beginning from the top of the sorted ordered list to each physical shard of the plurality of physical shards until all logical shards have been assigned to a physical shard (SKJOLSVOLD: [0233]; In some implementations, at least one candidate server is selected in an iteration of the searching, and in another iteration of the searching the at least one candidate server is updated to at least another server. The update may be based on a new candidate server having a greater available capacity or lower utilization in one or more dimensions. For example, servers 714 may be searched for assignment in a binary search or other type of search where each iteration of the search examines a different server or servers of servers 714 for the assignment. The at least one candidate server may be selected in one iteration and replaced by another server in a subsequent iteration. A new candidate server may have more available capacity, less utilization, or otherwise be more suitable for the partition or partitions being assigned. [0247]; The selection can be based on, for example, any combination of the number of partitions that are subject to load balancing, the size of those partitions, whether those partitions are presently assigned to a server or are unassigned, the number of servers available for assignment, the available capacity of those servers, or other factors).
It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the invention, to modify Shetty (teaches a file key for each file of a plurality of files stored in a plurality of physical shards, each physical shard maintained by a node of a plurality of nodes in one or more clusters, wherein the file key comprises a hash of an enterprise identifier (enterprise ID) for an entity to which the creator of the file is a member, a hash of a folder identifier (folder ID) for a location in which the file is stored, and a hash of a file identifier (file ID) uniquely identifying the file) with the teachings of Stringham (teaches logically partitioning, by the server of the cloud-based storage system, the ordered list into a plurality of logical shards to identifying, by the server of the cloud-based storage system, a last key value for each logical shard in the partitioned ordered list and saving, by the server of the cloud-based storage system, the identified last key value for each logical shard in a meta-store for the physical shard to which the logical shard is mapped) with the teachings of ISHERWOOD (teaches partially re-indexing the plurality of shards comprises selecting a set of largest shards in the one or more clusters based on an index size of each shard, redirecting traffic for each shard of the selected set of shards to a different cluster of the one or more clusters while the plurality of shards are being partially reindexed, and re-directing traffic back to each shard of the selected set of shards upon completion of partially reindexing the plurality of shards) to further include the teachings of SKJOLSVOLD (teaches spilling over a source shard of the plurality of physical shards to a target shard of the plurality of target shards…federating, by the server of the cloud-based storage system, the update to the override map of the meta-store for the source shard and the target shard to one or more components of the cloud-based storage system). One of ordinary skill in the art would have been motivated to make such a combination of providing better results in improving the load balancing with respect to the CPU utilization (See SKJOLSVOLD: [0039]). In addition, the references (Shetty, Stringham, ISHERWOOD, and SKJOLSVOLD) teach features that are directed to analogous art and they are directed to the same field of endeavor as Shetty, Stringham, ISHERWOOD, and SKJOLSVOLD are directed to mapping and storing partition information based on replicating data.
Regarding claim 19, the modification of Shetty, Stringham, and Isherwood teaches claimed invention substantially as claimed, however the modification of Shetty, Stringham, and Isherwood does not explicitly teach wherein the instructions further cause the processor to spill over a source shard of the plurality of physical shards to a target shard of the plurality of target shards by: identifying the source shard based on available storage capacity of each physical shard of the plurality of physical shards; identifying the target shard based on the available storage capacity of each physical shard of the plurality of physical shards; spilling content from the source shard to the target shard; updating an override map of the meta-store for the source shard and the target shard; and
federating the update to the override map of the meta-store for the source shard and the target shard to one or more components of the cloud-based storage system.
SKJOLSVOLD teaches wherein the instructions further cause the processor to spill over a source shard of the plurality of physical shards to a target shard of the plurality of target shards by: identifying the source shard based on available storage capacity of each physical shard of the plurality of physical shards (SKJOLSVOLD: [0224]; In bin packing, any given server bin can have a size or volume that corresponds to capacity (e.g. a cap value) and an occupied volume that corresponds to present utilization (e.g. a server value or a sum of partition values of the server). [0238]-[0239]; The dimensional values can be selected from partition values and server values of the dimensions. For example, the analysis may be based on dimensional values that would correspond to the servers if the assignment plans were to be executed. The selection can also be based on whether or not a server in the assignment plan is exceeding a cap value or how many servers or cap values are being exceeded in the assignment plan);
identifying the target shard based on the available storage capacity of each physical shard of the plurality of physical shards (SKJOLSVOLD: [0224]; In bin packing, any given server bin can have a size or volume that corresponds to capacity (e.g. a cap value) and an occupied volume that corresponds to present utilization (e.g. a server value or a sum of partition values of the server). [0238]-[0239]; The dimensional values can be selected from partition values and server values of the dimensions. For example, the analysis may be based on dimensional values that would correspond to the servers if the assignment plans were to be executed. The selection can also be based on whether or not a server in the assignment plan is exceeding a cap value or how many servers or cap values are being exceeded in the assignment plan);
spilling content from the source shard to the target shard (SKJOLSVOLD: [0154]; Furthermore, the one or more servers can be identified based on server load, such that the plurality of candidate operations is selected based on server load of the one or more servers. For example, the propose candidates function can sort each server of the scalable storage from busiest to idlest as quantified by a load metric, such as a server load metric);
updating an override map of the meta-store for the source shard and the target shard (SKJOLSVOLD: [0040]; It is noted that movement of a partition does not require physical movement of data at the storage level. Rather, in various implementations, a partition can be moved to another server by reassigning or remapping a partition range assignment of the partition to a new server (e.g. in a partition map)); and
federating the update to the override map of the meta-store for the source shard and the target shard to one or more components of the cloud-based storage system (SKJOLSVOLD: [0040]; It is noted that movement of a partition does not require physical movement of data at the storage level. Rather, in various implementations, a partition can be moved to another server by reassigning or remapping a partition range assignment of the partition to a new server (e.g. in a partition map)).
It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the invention, to modify Shetty (teaches a file key for each file of a plurality of files stored in a plurality of physical shards, each physical shard maintained by a node of a plurality of nodes in one or more clusters, wherein the file key comprises a hash of an enterprise identifier (enterprise ID) for an entity to which the creator of the file is a member, a hash of a folder identifier (folder ID) for a location in which the file is stored, and a hash of a file identifier (file ID) uniquely identifying the file) with the teachings of Stringham (teaches logically partitioning, by the server of the cloud-based storage system, the ordered list into a plurality of logical shards to identifying, by the server of the cloud-based storage system, a last key value for each logical shard in the partitioned ordered list and saving, by the server of the cloud-based storage system, the identified last key value for each logical shard in a meta-store for the physical shard to which the logical shard is mapped) with the teachings of ISHERWOOD (teaches partially re-indexing the plurality of shards comprises selecting a set of largest shards in the one or more clusters based on an index size of each shard, redirecting traffic for each shard of the selected set of shards to a different cluster of the one or more clusters while the plurality of shards are being partially reindexed, and re-directing traffic back to each shard of the selected set of shards upon completion of partially reindexing the plurality of shards) to further include the teachings of SKJOLSVOLD (teaches spilling over a source shard of the plurality of physical shards to a target shard of the plurality of target shards…federating, by the server of the cloud-based storage system, the update to the override map of the meta-store for the source shard and the target shard to one or more components of the cloud-based storage system). One of ordinary skill in the art would have been motivated to make such a combination of providing better results in improving the load balancing with respect to the CPU utilization (See SKJOLSVOLD: [0039]). In addition, the references (Shetty, Stringham, ISHERWOOD, and SKJOLSVOLD) teach features that are directed to analogous art and they are directed to the same field of endeavor as Shetty, Stringham, ISHERWOOD, and SKJOLSVOLD are directed to mapping and storing partition information based on replicating data.
Regarding claim 20, the modification of Shetty, Stringham, and Isherwood teaches claimed invention substantially as claimed, however the modification of Shetty, Stringham, and Isherwood does not explicitly teach wherein the instructions further cause the processor to spill over a source shard of the plurality of physical shards to a target shard of the plurality of target shards by: identifying available space for each node of the plurality of node in a cluster of the one or more clusters; selecting a plurality of target shards based on an overhead capacity of each shard of the plurality of shards on a node of the plurality of node having a most available space; identifying one or more source shards based on one or more overhead criteria; selecting one of the identified one or more source shards; spilling the selected one of the identified one or more source shards to the selected plurality of target shards; updating an override map of the meta-store for the source shard and each target shard; federating the update to the override map of the meta-store for the source shard and each target shard to one or more components of the cloud-based storage system; determining whether all of the identified one or more source shards have been spilled to the plurality of target shards; and in response to determining not all of the identified one or more source shards have been spilled to the plurality of target shards, repeating the selecting one of the identified one or more source shards, spilling the selected one of the identified one or more source shards to the selected plurality of target shards, updating the override map of the meta-store for the source shard and each target shard, and federating the update to the override map of the meta-store for the source shard and each target shard to one or more components of the cloud-based storage system until all of the identified one or more source shards have been spilled to the plurality of target shard.
SKJOLSVOLD teaches wherein the instructions further cause the processor to spill over a source shard of the plurality of physical shards to a target shard of the plurality of target shards by: identifying available space for each node of the plurality of node in a cluster of the one or more clusters (SKJOLSVOLD: [0203]; Assignment coordinator 702 can be configured to provide one or more servers 714 to which partitions 712 may be assigned. Servers 714 may comprise all servers of the scalable storage or a subset therefrom. Furthermore servers 714 may comprise all servers having available capacity in any dimension or dimensions or a given dimension or dimensions, or a subset therefrom);
selecting a plurality of target shards based on an overhead capacity of each shard of the plurality of shards on a node of the plurality of node having a most available space (SKJOLSVOLD: [0189]-[0190]; The propose candidates function can select the plurality of candidate operations based on server load of the servers of the scalable storage. For example, the propose candidates function can sort each server of the scalable storage from busiest to idlest as quantified by a load metric, such as the server load metric described above. The candidate target server set may be limited to a number of those servers that have the lowest server load. Thus, servers can be added to the candidate target server set based on server load, for example, based on having low server load. Thus, a load balancing operation could comprise a given partition from the candidate partitions set being moved to a given server of the candidate target server set, for each combination of partition and server in the sets);
identifying one or more source shards based on one or more overhead criteria (SKJOLSVOLD: [0200]; Assignment coordinator 702 can be configured to manage the generation of assignment plans for partitions to servers. In this respect, assignment coordinator 702 can be configured to collect and provide one or more partitions 712 to assignment generator 704 for assignment and one or more servers 714 (e.g. servers 204, 206, 208, and 210 in FIG. 2) of scalable storage (e.g. scalable storage 200 in FIG. 2) to assignment generator 704 that may receive those assignments);
selecting one of the identified one or more source shards (SKJOLSVOLD: [0152]; In some implementations, selecting the plurality of candidate operations comprises identifying one or more servers that have a server metric of a dimension, or a particular dimension that exceeds a threshold value, such as the server cap. [0154]; Furthermore, the one or more servers can be identified based on server load, such that the plurality of candidate operations is selected based on server load of the one or more servers. [0156]; Furthermore, the propose candidates function may sort the servers of the scalable storage from highest to lowest server metric of the dimension. A candidate target server set may be selected as a number of those servers that have the lowest server metric of the dimension);
spilling the selected one of the identified one or more source shards to the selected plurality of target shards (SKJOLSVOLD: [0154]; Furthermore, the one or more servers can be identified based on server load, such that the plurality of candidate operations is selected based on server load of the one or more servers. For example, the propose candidates function can sort each server of the scalable storage from busiest to idlest as quantified by a load metric, such as a server load metric);
updating an override map of the meta-store for the source shard and each target shard (SKJOLSVOLD: [0040]; It is noted that movement of a partition does not require physical movement of data at the storage level. Rather, in various implementations, a partition can be moved to another server by reassigning or remapping a partition range assignment of the partition to a new server (e.g. in a partition map));
federating the update to the override map of the meta-store for the source shard and each target shard to one or more components of the cloud-based storage system (SKJOLSVOLD: [0040]; It is noted that movement of a partition does not require physical movement of data at the storage level. Rather, in various implementations, a partition can be moved to another server by reassigning or remapping a partition range assignment of the partition to a new server (e.g. in a partition map));
determining whether all of the identified one or more source shards have been spilled to the plurality of target shards (SKJOLSVOLD: [0040]; It is noted that movement of a partition does not require physical movement of data at the storage level. Rather, in various implementations, a partition can be moved to another server by reassigning or remapping a partition range assignment of the partition to a new server (e.g. in a partition map)); and
in response to determining not all of the identified one or more source shards have been spilled to the plurality of target shards, repeating the selecting one of the identified one or more source shards, spilling the selected one of the identified one or more source shards to the selected plurality of target shards, updating the override map of the meta-store for the source shard and each target shard, and federating the update to the override map of the meta-store for the source shard and each target shard to one or more components of the cloud-based storage system until all of the identified one or more source shards have been spilled to the plurality of target shard (SKJOLSVOLD: [0233]; In some implementations, at least one candidate server is selected in an iteration of the searching, and in another iteration of the searching the at least one candidate server is updated to at least another server. The update may be based on a new candidate server having a greater available capacity or lower utilization in one or more dimensions. For example, servers 714 may be searched for assignment in a binary search or other type of search where each iteration of the search examines a different server or servers of servers 714 for the assignment. The at least one candidate server may be selected in one iteration and replaced by another server in a subsequent iteration. A new candidate server may have more available capacity, less utilization, or otherwise be more suitable for the partition or partitions being assigned. [0247]; The selection can be based on, for example, any combination of the number of partitions that are subject to load balancing, the size of those partitions, whether those partitions are presently assigned to a server or are unassigned, the number of servers available for assignment, the available capacity of those servers, or other factors).
It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the invention, to modify Shetty (teaches a file key for each file of a plurality of files stored in a plurality of physical shards, each physical shard maintained by a node of a plurality of nodes in one or more clusters, wherein the file key comprises a hash of an enterprise identifier (enterprise ID) for an entity to which the creator of the file is a member, a hash of a folder identifier (folder ID) for a location in which the file is stored, and a hash of a file identifier (file ID) uniquely identifying the file) with the teachings of Stringham (teaches logically partitioning, by the server of the cloud-based storage system, the ordered list into a plurality of logical shards to identifying, by the server of the cloud-based storage system, a last key value for each logical shard in the partitioned ordered list and saving, by the server of the cloud-based storage system, the identified last key value for each logical shard in a meta-store for the physical shard to which the logical shard is mapped) with the teachings of ISHERWOOD (teaches partially re-indexing the plurality of shards comprises selecting a set of largest shards in the one or more clusters based on an index size of each shard, redirecting traffic for each shard of the selected set of shards to a different cluster of the one or more clusters while the plurality of shards are being partially reindexed, and re-directing traffic back to each shard of the selected set of shards upon completion of partially reindexing the plurality of shards) to further include the teachings of SKJOLSVOLD (teaches spilling over a source shard of the plurality of physical shards to a target shard of the plurality of target shards…federating, by the server of the cloud-based storage system, the update to the override map of the meta-store for the source shard and the target shard to one or more components of the cloud-based storage system). One of ordinary skill in the art would have been motivated to make such a combination of providing better results in improving the load balancing with respect to the CPU utilization (See SKJOLSVOLD: [0039]). In addition, the references (Shetty, Stringham, ISHERWOOD, and SKJOLSVOLD) teach features that are directed to analogous art and they are directed to the same field of endeavor as Shetty, Stringham, ISHERWOOD, and SKJOLSVOLD are directed to mapping and storing partition information based on replicating data.
Allowable Subject Matter
Claims 7 and 14 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
The following is a statement of reasons for the indication of allowable subject matter: As recited above, Shetty teaches “logically partitioning, by the server of the cloud-based storage system, the ordered list into a plurality of logical shards to identifying, by the server of the cloud-based storage system, a last key value for each logical shard in the partitioned ordered list and saving, by the server of the cloud-based storage system, the identified last key value for each logical shard in a meta-store for the physical shard to which the logical shard is mapped” and Stringham teaches “logically partitioning, by the server of the cloud-based storage system, the ordered list into a plurality of logical shards to identifying, by the server of the cloud-based storage system, a last key value for each logical shard in the partitioned ordered list and saving, by the server of the cloud-based storage system, the identified last key value for each logical shard in a meta-store for the physical shard to which the logical shard is mapped...”. Additionally, SKJOLSVOLD teaches “…spilling over a source shard of the plurality of physical shards to a target shard of the plurality of target shards…identifying, by the server of the cloud-based storage…identifying, by the server of the cloud-based storage system, the target shard based on the available storage capacity of each physical shard of the plurality of physical shards; spilling…the update to the override map of the meta-store for the source shard and the target shard to one or more components of the cloud-based storage system”.
However, the cited prior arts do not teach “…selecting, by the server of the cloud-based storage system, a set of largest shards in the one or more clusters based on an index size….indexing, by the server of the cloud-based storage system, a first half of the key range for each shard of the selected set of the largest shards on a source node…indexing, by the server of the cloud-based storage system, a second half of the key range for each shard…node different from source node for the shard….redirecting, by the server of the cloud-based storage system, traffic back to each shard of the selected set of shards…”.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
U.S Patent 11,030,169 issued to Wu et al. (hereinafter as “Wu”) teaches processing and storage responsibility for a data set to be split based on the load associated with the shards of the dataset and re-sharding causes data to be split among additional computing nodes.
U.S Patent Application Publication 2016/0203168 issued to Gangadharappa et al. (hereinafter as “Gangadharappa”) teaches a first and second data changed to the data stored in a distributed database to be received by reindexing and revising data based on first and second data changes.
U.S Patent Application Publication 2013/0290249 issued to Merriman et al. (hereinafter as “Merriman”) teaches managing asynchronous replication in a distributed database environment by providing scaling in a distributed database.
THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Contact Information
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ANDREW N HO whose telephone number is (571)270-0590. The examiner can normally be reached Tuesday and Thursday 10:00-6:00.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Sherief Badawi can be reached at (571) 272-9782. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
4/14/2026
/ANDREW N HO/Examiner
Art Unit 2169
/SHERIEF BADAWI/Supervisory Patent Examiner, Art Unit 2169