DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Amendment
Applicant's response filed on 21 October 2025 has been considered and entered. Accordingly, claims 1-2, 4-5, 7-12, 14-15, 17-22, 24-25, and 27-30 are pending in this application. Claims 1, 11 and 21 are currently amended; claims 2, 4-5, 7-10, 12, 14-15, 17-20, 22, 24-25, 27-30 are previously presented; and claims 3, 6, 13, 16, 23, and 26 are cancelled.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1, 11 and 21 are rejected under 35 U.S.C. 103 as being unpatentable over Mathew et al. (US 11,132,267 B1), hereinafter Mathew (previously presented), in view of Federwisch et al. (US 2003/0182313 A1), hereinafter Federwisch (previously presented), and further in view of Diederich et al. (US 2016/0259809 A1), hereinafter Diederich.
As to claim 1, Mathew discloses a method comprising: generating a base snapshot of a distributed share used to replicate data from a source file system (FS) site over a network to a target FS site (Col. 4 line 63-67; Col. 5 line 1-3, A partial snapshot procedure is illustrated in FIG. 3. When at 302 it is determined that at least one node is not available, at 304 the system takes a snapshot of the relevant nodes that are available, while making a record of the nodes that were not available. At 306 a partial diff file is created by including the differences from the current snapshot to the immediately preceding snapshot, i.e., a base snapshot, but only for nodes that are currently online. Col. 3 line 17-21, Once the snapshot is created, the diff is formed again, but only for the partial snapshot that was previously unavailable. The diff is then sent to the destination, i.e., a target FS site, and the metadata updated accordingly reflecting the snapshot. Col. 2 line 47-49, “a system and method that maintains replication availability when a node in a clustered or distributed storage environment fails”. Thus, generating a base snapshot of a distributed share used to replicate data from a source file system (FS) site over a network to a target FS site.), wherein the base snapshot is a file system-level snapshot employing metadata of a virtualized file system to preserve a file system hierarchy during snapshot generation (Col. 3 line 30-37, “a distributed file system is employed, such that the data in the primary backup appliance 110 spans multiple nodes. The file system metadata has information about the directory structure-this information is only on the meta node of the mtree, i.e., a file system hierarchy. The file system also has references to inode structures for filenames in the mtree, which is also on the meta node. For files that are placed on remote nodes, this is a cached copy on the metadata node.”. Col. 3 line 62-67, “The process of taking a snapshot, will save a point in time image of the btree files in meta node and the remote nodes. Unlike the prior art, in disclosed embodiments when a node or a disk fails, replication proceeds by taking a (partial) snapshot of the mtree/filesystem among the nodes that are available.”. Col. 2 line 54-58, “When the node returns online, prior to allowing any write operation, a snapshot with the data of the newly joined node is obtained. Upon completing the update relating to the rejoined node, the entire snapshot is current and VSO is restored.”. Thus, the base snapshot is a file system-level snapshot employing metadata of a virtualized file system to preserve a file system hierarchy during snapshot generation.);
generating one or more subsequent file-system level snapshots of the distributed share at the source FS site (Col. 4 line 63-67; Col. 5 line 1-3, A partial snapshot procedure is illustrated in FIG. 3. When at 302 it is determined that at least one node is not available, at 304 the system takes a snapshot of the relevant nodes that are available, while making a record of the nodes that were not available. At 306 a partial diff file is created by including the differences from the current snapshot, i.e., a subsequent file-system level snapshot, to the immediately preceding snapshot, but only for nodes that are currently online. Col. 4 line 37-44, “the snapshot n+1 will be constructed on cp2 when cp2 comes up-before any new writes can happen in cp2. A diff between all files for cp2 will be done between snapshot n and snapshot n+1 and the deltas will be sent to the replica. At this time all files in replica will be of snapshot 'n+1'. In this manner, the data that was missing from the time of snapshot n to the time of failure is recaptured.”. Thus, a subsequent file-system level snapshot of the distributed share generated at the source FS site.);
computing deltas of the changed files (Col. 3 line 13-19, When the down node comes up, prior to allowing any write operation, the system repeats the creation of snapshot. Since this has to be done before any write occurs to that portion of the filesystem, this process should be part of filesystem/mtree recovery. Once the snapshot is created, the diff is formed again, i.e., computing deltas, but only for the partial snapshot that was previously unavailable. Col. 4 line 37-41, the snapshot n+1 will be constructed on cp2 when cp2 comes up-before any new writes can happen in cp2. A diff between all files for cp2 will be done between snapshot n and snapshot n+1 and the deltas will be sent to the replica. Col. 4 line 3-7, Each snapshot is identified by a snapshot ID (sid) which increments with the progression of snapshots taken. In this way, when a failed node comes back online, its last sid can be compared to the current sid to determine how many snapshot cycles it missed. Thus, the deltas of the changed files are being computed.); and
replicating the data corresponding to the deltas of the changed files to the target FS site (Col. 3 line 17-21, Once the snapshot is created, the diff is formed again, but only for the partial snapshot that was previously unavailable. The diff is then sent to the destination, i.e., the target FS site, and the metadata updated accordingly reflecting the snapshot. Col. 4 line 26-36, When the down node comes up, prior to allowing any write to that node, the partial replication must be updated. When the node comes back online, it needs to check two things: first, it checks for the parent directory to indicate that it is back online and the files on that node are now accessible. Second, it checks the sid of its last snapshot to determine how many snapshot generations it missed while being offline. A snapshot is then created and the diff is done again, but only for the missing part of the prior partial snapshot. The "rework" diff is sent to the destination, i.e., replicating the data to the target FS site, and the metadata is updated accordingly reflecting the snapshot. Thus, the data corresponding to the deltas of the changed files is replicated to the target FS site.).
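For illustration only, the snapshot-and-diff replication cycle described in the cited passages of Mathew (take a snapshot, diff it against the immediately preceding snapshot, and send only the deltas to the destination) can be sketched as follows. This sketch and all of its names (replicate_cycle, share, target) are hypothetical and appear in no cited reference.

```python
def replicate_cycle(share, target, snapshots):
    """Take a new snapshot of the share, diff it against the preceding
    (base) snapshot, and replicate only the deltas to the target site."""
    base = snapshots[-1] if snapshots else None  # immediately preceding snapshot
    current = dict(share)                        # point-in-time image of the share
    snapshots.append(current)
    if base is None:
        target.update(current)                   # first cycle: full baseline copy
        return current
    # diff: only files whose content changed since the base snapshot
    deltas = {path: data for path, data in current.items()
              if base.get(path) != data}
    target.update(deltas)                        # send only the deltas
    return deltas
```

On the second and later cycles only the changed files cross the network, mirroring the incremental diff file that Mathew sends to the destination.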
Mathew does not explicitly disclose performing a hierarchical construct level compare between the base and subsequent snapshots to identify altered inodes resulting from metadata corresponding to one or more changed files; wherein the deltas are associated with blocks of the changed files identified via traversal of the blocks using the altered inodes.
However, in the same field of endeavor, Federwisch discloses performing a hierarchical construct level compare between the base and subsequent snapshots to identify altered inodes resulting from metadata corresponding to one or more changed files (Figs. 4, 8; Para. 42, “Each filer 310, 312 also includes a storage operating system 400 (FIG. 4) that implements a file system to logically organize the information as a hierarchical structure of directories and files on the disks”. Para. 64, the transmission of incremental changes in snapshot data, i.e., subsequent snapshots, based upon a comparison of changed blocks in the whole volume is advantageous in that it transfers only incremental changes in data rather than a complete file system snapshot, thereby allowing updates to be smaller and faster. Para. 72, A scanner 820 searches the index for changed base/incremental inode file snapshot blocks, i.e., to identify altered inodes, comparing volume block numbers or another identifier. In the example of FIG. 8, block 4 in the base snapshot inode file 810 now corresponds in the file scan order to block 3 in the incremental snapshot inode file 812. This indicates a change of one or more underlying inodes, i.e., altered inodes. In addition, block 7 in the base snapshot inode file appears as block 8 in the incremental snapshot inode file. Para. 61, “Note that metadata in any snapshotted blocks (e.g. blocks 510, 515 and 520C) protects these blocks from [being] recycled or overwritten until they are released from all snapshots.”. Thus, performing a hierarchical construct level compare between the base and subsequent snapshots to identify altered inodes resulting from metadata corresponding to one or more changed files.);
wherein the deltas are associated with blocks of the changed files identified via traversal of the blocks using the altered inodes (Para. 14, Trees of blocks associated with the files are traversed, bypassing unchanged pointers between versions and walking down to identify the changes, i.e., the deltas, in the hierarchy of the tree. These changes are transmitted to the destination mirror or replicated snapshot. This technique allows regular files, directories, inodes and any other hierarchical structure to be efficiently scanned to determine differences between versions thereof. Para. 15, the source scans, with the scanner, along the index of logical file blocks for each snapshot looking for changed volume block numbers between the two source snapshots. Since disk blocks are always rewritten to new locations on the disk, a difference indicates changes, i.e., deltas, in the underlying inodes of the respective blocks, i.e., the deltas are associated with blocks. Using an inode picker process that receives changed blocks from the scanner, the source picks out inodes from changed blocks specifically associated with the selected qtree (or other sub-organization of the volume). Para. 82, “the tree may in fact contain several changed branches, requiring the worker (in fact, the above-described scanner 820 process) to traverse each of the branches in a recursive manner until all changes are identified.”. Thus, the deltas are associated with blocks of the changed files identified via traversal of the blocks using the altered inodes.).
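The scanner and tree-traversal technique of Federwisch as mapped above (compare volume block numbers in the two inode-file indices to find altered inodes, then walk only the changed branches of a file's block tree) can be illustrated as a sketch. All names here (changed_inode_blocks, diff_tree) are hypothetical and are not code from the reference; the block trees are modeled as nested dictionaries under that assumption.

```python
def changed_inode_blocks(base_index, incr_index):
    """Scan two inode-file indices (logical block -> volume block number) and
    return the logical blocks whose volume block numbers differ, i.e. the
    blocks holding one or more altered inodes."""
    return [lbn for lbn in incr_index
            if base_index.get(lbn) != incr_index[lbn]]

def diff_tree(base, incr, path=()):
    """Walk two block trees, descending only into branches whose pointers
    differ, and yield the changed leaf blocks (the deltas)."""
    if base == incr:
        return                      # unchanged pointer: skip the whole branch
    if not isinstance(incr, dict):  # leaf block reached
        yield path, incr
        return
    for key in incr:
        sub = base.get(key) if isinstance(base, dict) else None
        yield from diff_tree(sub, incr[key], path + (key,))
```

Because unchanged pointers prune entire subtrees, only the branches that actually changed between the base and incremental snapshots are visited, which is the efficiency the examiner's motivation statement relies on.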
Therefore, it would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the system of Mathew to implement the trees of blocks, i.e., the hierarchical structure of Federwisch, in the environment of Mathew in order to compute deltas by using the identified altered inodes from changed blocks as suggested by Federwisch (Para. 72). The picker process looks for versions of inodes that have changed between the two snapshots and picks out the changed version (Federwisch, Para. 15). The identified changed data block is then transferred over the network to become part of the changed volume snapshot set at the destination as a changed block. One of ordinary skill in the art would have been motivated to make this modification in order to allow updates to be smaller and faster by transferring only incremental changes in data rather than a complete file system snapshot to the destination as suggested by Federwisch (Para. 64).
The combination of Mathew and Federwisch does not explicitly disclose maintaining a replay list for files failing replication, wherein the failed files are replicated during a subsequent replication cycle.
However, in the same field of endeavor, Diederich discloses maintaining a replay list for files failing replication, wherein the failed files are replicated during a subsequent replication cycle (Para. 18, “to help ensure that failed files are properly backed up, replicated, deleted, and/or modified, failed files can be recorded by respective incremental data analysis processes 108a-n in one or more fault lists (e.g., a fault backup list, a fault replication list, a fault deletion list, and a fault modification list) for addition to the top of the incremental candidate list. In this manner, the appropriate operations can again be performed for those failed files prior to performing operations on the remaining files in the incremental candidate list”, where a fault replication list represents files failing replication. Para. 20, one or more of incremental data analysis processes 108a-n can create a fault backup list and a fault replication list, i.e., maintaining a replay list for files failing replication, where the fault backup list includes information identifying failed files that were not successfully backed up to backup storage pool 112, and the fault replication list includes information identifying failed files that were not successfully replicated to replication storage pool 114. By identifying and adding failed files to the fault lists, incremental data analysis processes 108a-n can add failed files to the incremental candidate list for subsequent operations, i.e., a subsequent replication cycle. Thus, the one or more incremental data analysis processes maintain a replay list for files failing replication, wherein the failed files are replicated during a subsequent replication cycle.);
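The fault-list mechanism described in the cited passages of Diederich (record failed files in a fault replication list, place them at the top of the next cycle's candidate list, and retry them before the remaining files) can be sketched as follows. The names used here (replication_cycle, replay_list, replicate) are hypothetical illustrations, not code from the reference.

```python
def replication_cycle(candidates, replicate, replay_list):
    """Retry previously failed files first, then new candidates; files that
    fail again are re-recorded for a subsequent replication cycle."""
    # failed files from the prior cycle go to the top of the candidate list
    worklist = replay_list + [f for f in candidates if f not in replay_list]
    next_replay = []
    for f in worklist:
        try:
            replicate(f)            # attempt replication of this file
        except OSError:
            next_replay.append(f)   # failed: retry during a subsequent cycle
    replay_list[:] = next_replay    # maintained replay (fault) list
```

A file that fails in one cycle is thus carried forward and replicated in a later cycle once the underlying fault clears, matching the claimed "replay list" behavior as mapped.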
Therefore, it would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of Diederich into the combined method of Mathew and Federwisch by creating a fault replication list, i.e., a replay list for files failing replication, using the diff file of Mathew during a failure of the system, in order to help ensure that failed files are properly replicated prior to performing operations on the remaining files in the incremental candidate list as suggested by Diederich (Para. 18). One or more of the incremental data analysis processes can create a fault replication list, where the fault replication list includes information identifying failed files that were not successfully replicated to the replication storage pool (Para. 20). The file analysis program performs a single, shared scan phase of the file system and generates shared scan output that can be used for the incremental data analysis processes and their respective operations (Para. 26). One of ordinary skill in the art would have been motivated to make this modification in order to reduce the amount of time and computing resources that would otherwise be consumed by performing separate walkthroughs of a file system during separate scan phases by using a shared scan output for various file-level incremental data analysis systems as suggested by Diederich (Para. 8).
As to claim 11, Mathew discloses a non-transitory computer readable medium including program instructions for execution on a processor (Col. 5 line 55-61), the program instructions configured to: generate a base snapshot of a distributed share used to replicate data from a source file system (FS) site over a network to a target FS site (Col. 4 line 63-67; Col. 5 line 1-3, A partial snapshot procedure is illustrated in FIG. 3. When at 302 it is determined that at least one node is not available, at 304 the system takes a snapshot of the relevant nodes that are available, while making a record of the nodes that were not available. At 306 a partial diff file is created by including the differences from the current snapshot to the immediately preceding snapshot, i.e., a base snapshot, but only for nodes that are currently online. Col. 3 line 17-21, Once the snapshot is created, the diff is formed again, but only for the partial snapshot that was previously unavailable. The diff is then sent to the destination, i.e., a target FS site, and the metadata updated accordingly reflecting the snapshot. Col. 2 line 47-49, “a system and method that maintains replication availability when a node in a clustered or distributed storage environment fails”. Thus, generating a base snapshot of a distributed share used to replicate data from a source file system (FS) site over a network to a target FS site.), wherein the base snapshot is a file system-level snapshot employing metadata of a virtualized file system to preserve a file system hierarchy during snapshot generation (Col. 3 line 30-37, “a distributed file system is employed, such that the data in the primary backup appliance 110 spans multiple nodes. The file system metadata has information about the directory structure-this information is only on the meta node of the mtree, i.e., a file system hierarchy. The file system also has references to inode structures for filenames in the mtree, which is also on the meta node. For files that are placed on remote nodes, this is a cached copy on the metadata node.”. Col. 3 line 62-67, “The process of taking a snapshot, will save a point in time image of the btree files in meta node and the remote nodes. Unlike the prior art, in disclosed embodiments when a node or a disk fails, replication proceeds by taking a (partial) snapshot of the mtree/filesystem among the nodes that are available.”. Col. 2 line 54-58, “When the node returns online, prior to allowing any write operation, a snapshot with the data of the newly joined node is obtained. Upon completing the update relating to the rejoined node, the entire snapshot is current and VSO is restored.”. Thus, the base snapshot is a file system-level snapshot employing metadata of a virtualized file system to preserve a file system hierarchy during snapshot generation.);
generate one or more subsequent file-system level snapshots of the distributed share at the source FS site (Col. 4 line 63-67; Col. 5 line 1-3, A partial snapshot procedure is illustrated in FIG. 3. When at 302 it is determined that at least one node is not available, at 304 the system takes a snapshot of the relevant nodes that are available, while making a record of the nodes that were not available. At 306 a partial diff file is created by including the differences from the current snapshot, i.e., a subsequent file-system level snapshot, to the immediately preceding snapshot, but only for nodes that are currently online. Col. 4 line 37-44, “the snapshot n+1 will be constructed on cp2 when cp2 comes up-before any new writes can happen in cp2. A diff between all files for cp2 will be done between snapshot n and snapshot n+1 and the deltas will be sent to the replica. At this time all files in replica will be of snapshot 'n+1'. In this manner, the data that was missing from the time of snapshot n to the time of failure is recaptured.”. Thus, a subsequent file-system level snapshot of the distributed share generated at the source FS site.);
compute deltas of the changed files (Col. 3 line 13-19, When the down node comes up, prior to allowing any write operation, the system repeats the creation of snapshot. Since this has to be done before any write occurs to that portion of the filesystem, this process should be part of filesystem/mtree recovery. Once the snapshot is created, the diff is formed again, i.e., computing deltas, but only for the partial snapshot that was previously unavailable. Col. 4 line 37-41, the snapshot n+1 will be constructed on cp2 when cp2 comes up-before any new writes can happen in cp2. A diff between all files for cp2 will be done between snapshot n and snapshot n+1 and the deltas will be sent to the replica. Col. 4 line 3-7, Each snapshot is identified by a snapshot ID (sid) which increments with the progression of snapshots taken. In this way, when a failed node comes back online, its last sid can be compared to the current sid to determine how many snapshot cycles it missed. Thus, the deltas of the changed files are being computed.); and
replicate the data corresponding to the deltas of the changed files to the target FS site (Col. 3 line 17-21, Once the snapshot is created, the diff is formed again, but only for the partial snapshot that was previously unavailable. The diff is then sent to the destination, i.e., the target FS site, and the metadata updated accordingly reflecting the snapshot. Col. 4 line 26-36, When the down node comes up, prior to allowing any write to that node, the partial replication must be updated. When the node comes back online, it needs to check two things: first, it checks for the parent directory to indicate that it is back online and the files on that node are now accessible. Second, it checks the sid of its last snapshot to determine how many snapshot generations it missed while being offline. A snapshot is then created and the diff is done again, but only for the missing part of the prior partial snapshot. The "rework" diff is sent to the destination, i.e., replicating the data to the target FS site, and the metadata is updated accordingly reflecting the snapshot. Thus, the data corresponding to the deltas of the changed files is replicated to the target FS site.).
Mathew does not explicitly disclose perform a hierarchical construct level compare between the base and subsequent snapshots to identify altered inodes resulting from metadata corresponding to one or more changed files; wherein the deltas are associated with blocks of the changed files identified via traversal of the blocks using the altered inodes.
However, in the same field of endeavor, Federwisch discloses perform a hierarchical construct level compare between the base and subsequent snapshots to identify altered inodes resulting from metadata corresponding to one or more changed files (Figs. 4, 8; Para. 42, “Each filer 310, 312 also includes a storage operating system 400 (FIG. 4) that implements a file system to logically organize the information as a hierarchical structure of directories and files on the disks”. Para. 64, the transmission of incremental changes in snapshot data, i.e., subsequent snapshots, based upon a comparison of changed blocks in the whole volume is advantageous in that it transfers only incremental changes in data rather than a complete file system snapshot, thereby allowing updates to be smaller and faster. Para. 72, A scanner 820 searches the index for changed base/incremental inode file snapshot blocks, i.e., to identify altered inodes, comparing volume block numbers or another identifier. In the example of FIG. 8, block 4 in the base snapshot inode file 810 now corresponds in the file scan order to block 3 in the incremental snapshot inode file 812. This indicates a change of one or more underlying inodes, i.e., altered inodes. In addition, block 7 in the base snapshot inode file appears as block 8 in the incremental snapshot inode file. Para. 61, “Note that metadata in any snapshotted blocks (e.g. blocks 510, 515 and 520C) protects these blocks from [being] recycled or overwritten until they are released from all snapshots.”. Thus, performing a hierarchical construct level compare between the base and subsequent snapshots to identify altered inodes resulting from metadata corresponding to one or more changed files.);
wherein the deltas are associated with blocks of the changed files identified via traversal of the blocks using the altered inodes (Para. 14, Trees of blocks associated with the files are traversed, bypassing unchanged pointers between versions and walking down to identify the changes, i.e., the deltas, in the hierarchy of the tree. These changes are transmitted to the destination mirror or replicated snapshot. This technique allows regular files, directories, inodes and any other hierarchical structure to be efficiently scanned to determine differences between versions thereof. Para. 15, the source scans, with the scanner, along the index of logical file blocks for each snapshot looking for changed volume block numbers between the two source snapshots. Since disk blocks are always rewritten to new locations on the disk, a difference indicates changes, i.e., deltas, in the underlying inodes of the respective blocks, i.e., the deltas are associated with blocks. Using an inode picker process that receives changed blocks from the scanner, the source picks out inodes from changed blocks specifically associated with the selected qtree (or other sub-organization of the volume). Para. 82, “the tree may in fact contain several changed branches, requiring the worker (in fact, the above-described scanner 820 process) to traverse each of the branches in a recursive manner until all changes are identified.”. Thus, the deltas are associated with blocks of the changed files identified via traversal of the blocks using the altered inodes.).
Therefore, it would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the system of Mathew to implement the trees of blocks, i.e., the hierarchical structure of Federwisch, in the environment of Mathew in order to compute deltas by using the identified altered inodes from changed blocks as suggested by Federwisch (Para. 72). The picker process looks for versions of inodes that have changed between the two snapshots and picks out the changed version (Federwisch, Para. 15). The identified changed data block is then transferred over the network to become part of the changed volume snapshot set at the destination as a changed block. One of ordinary skill in the art would have been motivated to make this modification in order to allow updates to be smaller and faster by transferring only incremental changes in data rather than a complete file system snapshot to the destination as suggested by Federwisch (Para. 64).
The combination of Mathew and Federwisch does not explicitly disclose maintaining a replay list for files failing replication, wherein the failed files are replicated during a subsequent replication cycle.
However, in the same field of endeavor, Diederich discloses maintaining a replay list for files failing replication, wherein the failed files are replicated during a subsequent replication cycle (Para. 18, “to help ensure that failed files are properly backed up, replicated, deleted, and/or modified, failed files can be recorded by respective incremental data analysis processes 108a-n in one or more fault lists (e.g., a fault backup list, a fault replication list, a fault deletion list, and a fault modification list) for addition to the top of the incremental candidate list. In this manner, the appropriate operations can again be performed for those failed files prior to performing operations on the remaining files in the incremental candidate list”, where a fault replication list represents files failing replication. Para. 20, one or more of incremental data analysis processes 108a-n can create a fault backup list and a fault replication list, i.e., maintaining a replay list for files failing replication, where the fault backup list includes information identifying failed files that were not successfully backed up to backup storage pool 112, and the fault replication list includes information identifying failed files that were not successfully replicated to replication storage pool 114. By identifying and adding failed files to the fault lists, incremental data analysis processes 108a-n can add failed files to the incremental candidate list for subsequent operations, i.e., a subsequent replication cycle. Thus, the one or more incremental data analysis processes maintain a replay list for files failing replication, wherein the failed files are replicated during a subsequent replication cycle.);
Therefore, it would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of Diederich into the combined method of Mathew and Federwisch by creating a fault replication list, i.e., a replay list for files failing replication, using the diff file of Mathew during a failure of the system, in order to help ensure that failed files are properly replicated prior to performing operations on the remaining files in the incremental candidate list as suggested by Diederich (Para. 18). One or more of the incremental data analysis processes can create a fault replication list, where the fault replication list includes information identifying failed files that were not successfully replicated to the replication storage pool (Para. 20). The file analysis program performs a single, shared scan phase of the file system and generates shared scan output that can be used for the incremental data analysis processes and their respective operations (Para. 26). One of ordinary skill in the art would have been motivated to make this modification in order to reduce the amount of time and computing resources that would otherwise be consumed by performing separate walkthroughs of a file system during separate scan phases by using a shared scan output for various file-level incremental data analysis systems as suggested by Diederich (Para. 8).
As to claim 21, Mathew discloses an apparatus comprising: a network connecting one or more nodes of source file system (FS) sites to one or more target FS sites (Col. 6 line 1-22), each node having a processor configured to execute program instructions to (Col. 5 line 55-61): generate a base snapshot of a distributed share used to replicate data from a source file system (FS) site over a network to a target FS site (Col. 4 line 63-67; Col. 5 line 1-3, A partial snapshot procedure is illustrated in FIG. 3. When at 302 it is determined that at least one node is not available, at 304 the system takes a snapshot of the relevant nodes that are available, while making a record of the nodes that were not available. At 306 a partial diff file is created by including the differences from the current snapshot to the immediately preceding snapshot, i.e., a base snapshot, but only for nodes that are currently online. Col. 3 line 17-21, Once the snapshot is created, the diff is formed again, but only for the partial snapshot that was previously unavailable. The diff is then sent to the destination, i.e., a target FS site, and the metadata updated accordingly reflecting the snapshot. Col. 2 line 47-49, “a system and method that maintains replication availability when a node in a clustered or distributed storage environment fails”. Thus, generating a base snapshot of a distributed share used to replicate data from a source file system (FS) site over a network to a target FS site.), wherein the base snapshot is a file system-level snapshot employing metadata of a virtualized file system to preserve a file system hierarchy during snapshot generation (Col. 3 line 30-37, “a distributed file system is employed, such that the data in the primary backup appliance 110 spans multiple nodes. The file system metadata has information about the directory structure-this information is only on the meta node of the mtree, i.e., a file system hierarchy. The file system also has references to inode structures for filenames in the mtree, which is also on the meta node. For files that are placed on remote nodes, this is a cached copy on the metadata node.”. Col. 3 line 62-67, “The process of taking a snapshot, will save a point in time image of the btree files in meta node and the remote nodes. Unlike the prior art, in disclosed embodiments when a node or a disk fails, replication proceeds by taking a (partial) snapshot of the mtree/filesystem among the nodes that are available.”. Col. 2 line 54-58, “When the node returns online, prior to allowing any write operation, a snapshot with the data of the newly joined node is obtained. Upon completing the update relating to the rejoined node, the entire snapshot is current and VSO is restored.”. Thus, the base snapshot is a file system-level snapshot employing metadata of a virtualized file system to preserve a file system hierarchy during snapshot generation.);
generate one or more subsequent file-system level snapshots of the distributed share at the source FS site (Col. 4 line 63-67; Col. 5 line 1-3, A partial snapshot procedure is illustrated in FIG. 3. When at 302 it is determined that at least one node is not available, at 304 the system takes a snapshot of the relevant nodes that are available, while making a record of the nodes that were not available. At 306 a partial diff file is created by including the differences from the current snapshot, i.e., a subsequent file-system level snapshot, to the immediately preceding snapshot, but only for nodes that are currently online. Col. 4 line 37-44, “the snapshot n+1 will be constructed on cp2 when cp2 comes up, before any new writes can happen in cp2. A diff between all files for cp2 will be done between snapshot n and snapshot n+1 and the deltas will be sent to the replica. At this time all files in replica will be of snapshot 'n+1'. In this manner, the data that was missing from the time of snapshot n to the time of failure is recaptured.”. Thus, a subsequent file-system level snapshot of the distributed share generated at the source FS site.);
compute deltas of the changed files (Col. 3 line 13-19, When the down node comes up, prior to allowing any write operation, the system repeats the creation of snapshot. Since this has to be done before any write occurs to that portion of the filesystem, this process should be part of filesystem/mtree recovery. Once the snapshot is created, the diff is formed again, i.e., computing deltas, but only for the partial snapshot that was previously unavailable. Col. 4 line 37-41, the snapshot n+1 will be constructed on cp2 when cp2 comes up, before any new writes can happen in cp2. A diff between all files for cp2 will be done between snapshot n and snapshot n+1 and the deltas will be sent to the replica. Col. 4 line 3-7, Each snapshot is identified by a snapshot ID (sid) which increments with the progression of snapshots taken. In this way, when a failed node comes back online, its last sid can be compared to the current sid to determine how many snapshot cycles it missed. Thus, the deltas of the changed files are being computed.); and
replicate the data corresponding to the deltas of the changed files to the target FS site (Col. 3 line 17-21, Once the snapshot is created, the diff is formed again, but only for the partial snapshot that was previously unavailable. The diff is then sent to the destination, i.e., the target FS site, and the metadata updated accordingly reflecting the snapshot. Col. 4 line 26-36, When the down node comes up, prior to allowing any write to that node, the partial replication must be updated. When the node comes back online, it needs to check two things: first, it checks for the parent directory to indicate that it is back online and the files on that node are now accessible. Second, it checks the sid of its last snapshot to determine how many snapshot generations it missed while being offline. A snapshot is then created and the diff is done again, but only for the missing part of the prior partial snapshot. The "rework" diff is sent to the destination, i.e., replicating the data to the target FS site, and the metadata is updated accordingly reflecting the snapshot. Thus, the data corresponding to the deltas of the changed files replicated to the target FS site.).
Mathew does not explicitly disclose perform a hierarchical construct level compare between the base and subsequent snapshots to identify altered inodes resulting from metadata corresponding to one or more changed files; wherein the deltas are associated with blocks of the changed files identified via traversal of the blocks using the altered inodes.
However, in the same field of endeavor, Federwisch discloses perform a hierarchical construct level compare between the base and subsequent snapshots to identify altered inodes resulting from metadata corresponding to one or more changed files (Fig. 4; 8, Para. 42, “Each filer 310, 312 also includes a storage operating system 400 (FIG. 4) that implements a file system to logically organize the information as a hierarchical structure of directories and files on the disks”. Para. 64, the transmission of incremental changes in snapshot data, i.e., subsequent snapshots, based upon a comparison of changed blocks in the whole volume is advantageous in that it transfers only incremental changes in data rather than a complete file system snapshot, thereby allowing updates to be smaller and faster. Para. 72, A scanner 820 searches the index for changed base/incremental inode file snapshot blocks, i.e., to identify altered inodes, comparing volume block numbers or another identifier. In the example of FIG. 8, block 4 in the base snapshot inode file 810 now corresponds in the file scan order to block 3 in the incremental snapshot inode file 812. This indicates a change of one or more underlying inodes, i.e., altered inodes. In addition, block 7 in the base snapshot inode file appears as block 8 in the incremental snapshot inode file. Para. 61, “Note that metadata in any snapshotted blocks (e.g. blocks 510, 515 and 520C) protects these blocks from recycled or overwritten until they are released from all snapshots.”. Thus, performing a hierarchical construct level compare between the base and subsequent snapshots to identify altered inodes resulting from metadata corresponding to one or more changed files.);
wherein the deltas are associated with blocks of the changed files identified via traversal of the blocks using the altered inodes (Para. 14, Trees of blocks associated with the files are traversed, bypassing unchanged pointers between versions and walking down to identify the changes, i.e., the deltas, in the hierarchy of the tree. These changes are transmitted to the destination mirror or replicated snapshot. This technique allows regular files, directories, inodes and any other hierarchical structure to be efficiently scanned to determine differences between versions thereof. Para. 15, the source scans, with the scanner, along the index of logical file blocks for each snapshot looking for changed volume block numbers between the two source snapshots. Since disk blocks are always rewritten to new locations on the disk, a difference indicates changes, i.e., deltas, in the underlying inodes of the respective blocks, i.e., the deltas are associated with blocks. Using an inode picker process that receives changed blocks from the scanner, the source picks out inodes from changed blocks specifically associated with the selected qtree (or other sub-organization of the volume). Para. 82, “the tree may in fact contain several changed branches, requiring the worker (in fact, the above-described scanner 820 process) to traverse each of the branches in a recursive manner until all changes are identified.”. Thus, the deltas are associated with blocks of the changed files identified via traversal of the blocks using the altered inodes.).
Therefore, it would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the system of Mathew such that the trees of blocks, such as the hierarchical structure of Federwisch, can be implemented in the environment of Mathew in order to compute deltas by using the identified altered inodes of Federwisch from changed blocks as suggested by Federwisch (Para. 72). The picker process looks for versions of inodes that have changed between the two snapshots and picks out the changed version (Federwisch, Para. 15). The identified changed data block is then transferred over the network to become part of the changed volume snapshot set at the destination as a changed block. One of ordinary skill in the art would have been motivated to make this modification in order to allow updates to be smaller and faster by transferring only incremental changes in data rather than a complete file system snapshot to the destination as suggested by Federwisch (Para. 64).
The combination of Mathew and Federwisch does not explicitly disclose maintain a replay list for files failing replication, wherein the failed files are replicated during a subsequent replication cycle.
However, in the same field of endeavor, Diederich discloses maintain a replay list for files failing replication, wherein the failed files are replicated during a subsequent replication cycle (Para. 18, “to help ensure that failed files are properly backed up, replicated, deleted, and/or modified, failed files can be recorded by respective incremental data analysis processes 108a-n in one or more fault lists (e.g., a fault backup list, a fault replication list, a fault deletion list, and a fault modification list) for addition to the top of the incremental candidate list. In this manner, the appropriate operations can again be performed for those failed files prior to performing operations on the remaining files in the incremental candidate list”, where a fault replication list represents files failing replication. Para. 20, one or more of incremental data analysis processes 108a-n can create a fault backup list and a fault replication list, i.e., maintaining a replay list for files failing replication, where the fault backup list includes information identifying failed files that were not successfully backed up to backup storage pool 112, and the fault replication list includes information identifying failed files that were not successfully replicated to replication storage pool 114. By identifying and adding failed files to the fault lists, incremental data analysis processes 108a-n can add failed files to the incremental candidate list for subsequent operations, i.e., a subsequent replication cycle. Thus, the one or more incremental data analysis processes maintain a replay list for files failing replication, wherein the failed files are replicated during a subsequent replication cycle.).
Therefore, it would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of Diederich into the combined method of Mathew and Federwisch by creating a fault replication list such as a replay list for files failing replication by using the diff file of Mathew during the failure of the system in order to help ensure that failed files are properly replicated prior to performing operations on the remaining files in the incremental candidate list as suggested by Diederich (Para. 18). One or more of the incremental data analysis processes can create a fault replication list, where the fault replication list includes information identifying failed files that were not successfully replicated to the replication storage pool (Para. 20). The file analysis program performs a single, shared scan phase of the file system and generates shared scan output that can be used for the incremental data analysis processes and their respective operations (Para. 26). One of ordinary skill in the art would have been motivated to make this modification in order to reduce the amount of time and computing resources that would otherwise be consumed by performing separate walkthroughs of a file system during separate scan phases by using a shared scan output for various file-level incremental data analysis systems as suggested by Diederich (Para. 0008).
5. Claims 2, 4-5, 7-9, 12, 14-15, 17-19 and 22, 24-25, 27-29 are rejected under 35 U.S.C. 103 as being unpatentable over Mathew, Federwisch and Diederich as applied above, in view of Botelho et al. (previously presented) (US 2022/0374519 A1) hereinafter Botelho.
As to claims 2, 12, and 22, the claims are rejected for the same reasons as claims 1, 11, and 21 above. Mathew, Federwisch and Diederich do not explicitly disclose wherein the metadata relates to one of modification timestamps of the files or file length.
However, in the same field of endeavor, Botelho discloses wherein the metadata relates to one of modification timestamps of the files or file length (Para. 103, “The metadata for a first chunk of the one or more chunks may include information specifying a version of the virtual machine associated with the frozen copy, a time associated with the version (e.g., the snapshot of the virtual machine was taken at 5:30 p.m. on Jun. 29, 2018), and a file path to where the first chunk is stored within the distributed file system 112 (e.g., the first chunk is located at /snapshotsNM_B/sl/sl. Chunk1).”. Para. 106, “The metadata may also include a name of a file, the size of the file, the last time at which the file was modified, and a content checksum for the file. Each file that has been added, deleted, or modified since a previous snapshot was captured may be determined using the metadata (e.g., by comparing the time at which a file was last modified with a time associated with the previous snapshot).”. Thus, the metadata relates to one of modification timestamps of the files or file length.)
Therefore, it would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of Botelho into the combined method of Mathew, Federwisch and Diederich by including the time associated with the snapshot as metadata such that the modified version of the snapshot can be retrieved in order to recover data for the failed node in the distributed file system as suggested by Botelho (Para. 103). The different snapshots at different points in time are taken for identifying changes between nodes in case of node failure so that the failed node can fill in the missing data using the metadata, which includes modification timestamps of the files. One of ordinary skill in the art would have been motivated to make this modification in order to predict anomalies faster by primarily ingesting the file change statistics in the file system metadata of Mathew since the last snapshot, rather than the full snapshot itself, which is relatively lightweight and highly scalable as suggested by Botelho (Para. 233).
As to claims 4, 14, and 24, the claims are rejected for the same reasons as claims 1, 11, and 21 above. In addition, Botelho discloses wherein replicating further comprises spanning the data across parallel synchronization threads of the source FS site by partitioning the data into predetermined chunks (Para. 94, “Each file stored in the distributed file system 112 may be partitioned into one or more chunks or shards. Each of the one or more chunks may be stored within the distributed file system 112 as a separate file. The files stored within the distributed file system 112 may be replicated or mirrored over a plurality of physical machines, thereby creating a load-balanced and fault tolerant distributed file system.”. Para. 317, “The job engines 5606 of multiple DMS nodes 5514 may generate the snapshots of the machines of the application in parallel (e.g., as defined by the shared start time for the jobs) by capturing data from the compute infrastructure 5502 to generate a synchronous snapshot of the application. As the needed resources for each of the fetch jobs has been allocated, and each of the job engines 5606 has retrieved a respective job of the application for execution, the snapshots of the machines are synchronized.”. Thus, replicating further comprises spanning the data across parallel synchronization threads of the source FS site by partitioning the data into predetermined chunks.).
As to claims 5, 15, and 25, the claims are rejected for the same reasons as claims 1, 11, and 21 above. In addition, Botelho discloses further comprising reusing the computed deltas to replicate the data across parallel synchronization threads from a portion of the source FS over the network to another target FS site (Para. 94, “Each file stored in the distributed file system 112 may be partitioned into one or more chunks or shards. Each of the one or more chunks may be stored within the distributed file system 112 as a separate file. The files stored within the distributed file system 112 may be replicated or mirrored over a plurality of physical machines, thereby creating a load-balanced and fault tolerant distributed file system.”. Para. 317, “In response to determining that each of the data fetch jobs is ready for execution, at operation 6035 the DMS cluster 5512 (e.g., the job engines 5606 of multiple DMS nodes 5514) executes the data fetch jobs to generate snapshots of the set of machines. The job engines 5606 of multiple DMS nodes 5514 may generate the snapshots of the machines of the application in parallel (e.g., as defined by the shared start time for the jobs) by capturing data from the compute infrastructure 5502 to generate a synchronous snapshot of the application. Each job engine 5606 may freeze a machine and take the snapshot of the machine, transferring the snapshot (or the incremental differences), and release the machine. As the needed resources for each of the fetch jobs has been allocated, and each of the job engines 5606 has retrieved a respective job of the application for execution, the snapshots of the machines are synchronized.”. Thus, reusing the computed deltas to replicate the data across parallel synchronization threads from a portion of the source FS over the network to another target FS site.).
As to claims 7, 17, and 27, the claims are rejected for the same reasons as claims 1, 11, and 21 above. In addition, Botelho discloses further comprising scanning a source directory from the base snapshot at the source FS site to generate a file list of the data based on mapping locations of the distributed share at the target FS site (Para. 105, “The virtual machine search index 106 may include a list of files that have been stored using a virtual machine and a version history for each of the files in the list. Each version of a file may be mapped to the earliest point-in-time snapshot of the virtual machine that includes the version of the file or to a snapshot of the virtual machine that includes the version of the file (e.g., the latest point-in-time snapshot of the virtual machine that includes the version of the file).”. Para. 106, for every file that has existed within any of the snapshots of the virtual machine, a virtual machine search index may be used, i.e., scanning a source directory, to identify when the file was first created (e.g., corresponding with a first version of the file) and at what times the file was modified (e.g., corresponding with subsequent versions of the file). Each version of the file may be mapped to a particular version of the virtual machine that stores that version of the file. Para. 96, “the one or more versions of the virtual machine may correspond with a plurality of files. The plurality of files may include a single full image snapshot of the virtual machine and one or more incremental aspects derived from the single full image snapshot.”. Thus, scanning a source directory from the base snapshot at the source FS site to generate a file list of the data based on mapping locations of the distributed share at the target FS site.).
As to claims 8, 18, and 28, the claims are rejected for the same reasons as claims 1, 11, and 21 above. In addition, Botelho discloses wherein replicating the data corresponding to the deltas of the changed files is screened by a file-oriented pathname filter (Para. 118, With reference to FIG. 8, a networked environment 800 includes a virtual machine (VM) 802, an ESX server 804, and a backup site 806. The ESX server 804 includes an I/O stack 808 and an I/O filter 810. The I/O filter 810 (also known as a replication filter, or plugin filter), i.e., a file-oriented pathname filter, may include a plugin filter driver to intercept I/Os for the purpose of caching and replication. Thus, replicating the data corresponding to the deltas of the changed files is screened by a file-oriented pathname filter.).
As to claims 9, 19, and 29, the claims are rejected for the same reasons as claims 1, 11, and 21 above. In addition, Botelho discloses wherein computing the deltas of the changed files comprises computing a modified directories/files list employed during incremental replication to replicate the data corresponding to the deltas of the changed files (Para. 111, After the base snapshot 402 is saved on a backup site 406, incremental snapshots 408 are taken periodically. A delta 410 between the two snapshots 402 and 408 represents data blocks that have changed, i.e., a modified directories/files list, and these blocks 412 may be sent to and stored on the backup site 406 for recovery when needed. Para. 96, “the distributed metadata store 110 may be used to manage one or more versions of a virtual machine. Each version of the virtual machine may correspond with a full image snapshot of the virtual machine stored within the distributed file system 112 or an incremental snapshot of the virtual machine (e.g., a forward incremental or reverse incremental) stored within the distributed file system 112. In one example, the one or more versions of the virtual machine may correspond with a plurality of files. The plurality of files may include a single full image snapshot of the virtual machine and one or more incremental aspects derived from the single full image snapshot.”. Thus, computing the deltas of one or more files comprises computing a modified directories/files list employed during incremental replication to replicate the data corresponding to the deltas of the changed files.).
6. Claims 10, 20 and 30 are rejected under 35 U.S.C. 103 as being unpatentable over Mathew, Federwisch, Diederich and Botelho as applied above, in view of PAWAR et al. (previously presented) (US 2016/0019317 A1) hereinafter PAWAR.
As to claims 10, 20 and 30, the claims are rejected for the same reasons as claims 9, 19, and 29 above. The combination of Mathew, Federwisch, Diederich and Botelho does not explicitly disclose further comprising applying customer-provided filters to the modified directories/files list to (i) mark directories to which the changed files belong and (ii) eliminate types of files excluded from replication.
However, in the same field of endeavor, PAWAR discloses further comprising applying customer-provided filters to the modified directories/files list to (i) mark directories to which the changed files belong and (ii) eliminate types of files excluded from replication (Para. 286, “The hypervisor 227 may provide an API to identify changed blocks for VMs 225. The system 200 can query the hypervisor 227 in order to obtain the changed blocks. In other embodiments, the system 200 can include a driver which marks used blocks in the volume 233 (e.g., as dirty) to identify potential changed blocks for incremental backup”. Para. 293, The image-level backup manager 250 may allow the user (e.g., system administrator) to create custom rules or filters, i.e., customer-provided filters, relating to which VM files 237 may or may not be removed. Criteria for rules can include file size, dates associated with the file 237, user associated with the file 237, file extension, file type, file application, etc. For example, the user may specify that files greater than a certain size, files older than a certain date, files belonging to certain users, etc. can be removed. Or the user may also specify that certain types of files, certain types of extensions, etc. may not be removed (e.g., due to compliance reasons), where user specified files indicate “mark directories to which the changed files belong”. The user can define filters, i.e., customer-provided filters, that can exclude particular files from being removed from the information store 230, such as files having certain extensions or relating to certain subject matters (e.g., tax, audit, etc.). The rules and/or filters may be specified on the basis of each VM 225, a group of VMs 225, a volume 233, multiple volumes 233, a LUN 231, multiple LUNs 231, an information store 230, a group of information stores 230, the entire system 200, etc. Para. 294, “The image-level backup manager 250 can create a list of VM files 237 that are eligible candidates for removal from the information store 230. For example, the list, i.e., modified directories/files list, can include the file name, the file path in the information store 230, the VM 225 associated with the file 237, the LUN 231 or volume 233 associated with the VM 225, etc.”. Para. 301, “the image-level backup manager 250 flags file F1 287a as a candidate for removal from the information store 230 and includes it in the list of VM files 237 that can be removed. The list can identify information relating to file F1 237a corresponding to file F1 287a, such as the file path of the file F1 237a within the volume 233.”. Thus, applying customer-provided filters to the modified directories/files list to (i) mark directories to which the changed files belong and (ii) eliminate types of files excluded from replication.).
Therefore, it would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of PAWAR into the combined method of Mathew, Federwisch, Diederich and Botelho by including the user defined filters such as the customer-provided filters in the environment of Mathew in order to identify a list of files, such as the changed files, that can be removed before replication as disclosed by PAWAR (Para. 293-294). The system does not need to access each VM in order to determine which VM files can be removed from the primary storage by using the image-level backup. One of ordinary skill in the art would have been motivated to make this modification in order to save disk space in the primary storage and use resources more efficiently by removing infrequently used or unused VM files from the primary storage by using the user defined filters as suggested by PAWAR (Para. 303).
Response to Arguments
7. Applicant’s arguments with respect to claims 1-2, 4-5, 7-12, 14-15, 17-22, 24-25, and 27-30 have been considered but are moot because of the new ground of rejection necessitated by the amendment to the claims. For Examiner's response, see discussion below:
Applicant's arguments, see pages 9-12, with respect to the rejections of claims 1-2, 4-5, 7-12, 14-15, 17-22, 24-25, and 27-30 under 35 USC §103 have been considered but are moot in view of the new ground(s) of rejection necessitated by applicant's amendments as set forth in the respective rejections of claims 1-2, 4-5, 7-12, 14-15, 17-22, 24-25, and 27-30 under 35 USC §103 above in view of the newly found reference.
Conclusion
8. The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Dave et al. (US 2017/0060702 A1) teaches file-based cluster-to-cluster replication recovery.
9. THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MOHAMMAD SOLAIMAN BHUYAN whose telephone number is (571)272-7843. The examiner can normally be reached on Monday - Friday 9:00am-5:00pm EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Charles Rones, can be reached on 571-272-4085. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/MOHAMMAD S BHUYAN/Examiner, Art Unit 2168
/CHARLES RONES/Supervisory Patent Examiner, Art Unit 2168