DETAILED ACTION
This Office action is in response to Applicant’s reply filed 10/13/2025.
Claims 1-21 are pending.
Independent claims 1, 8, and 15 are amended.
Claims 1-21 are rejected.
Notice of AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Priority
This application is a continuation of 17/664,144, a continuation of 17/068,221, and a continuation of provisional application 63/092,347.
Examiner Notes/Objections
Underlined elements indicate newly amended limitations, newly incorporated references, or newly added text in general.
Claims 1, 8, and 15 are objected to for reciting “to generate locally merged files to generally first merged files” and will be interpreted as reciting “to generate locally merged files to generate first merged files,” further interpreted as reciting “to generate locally merged files [that are also] generated first merged files.”
Statutory Review under 35 USC § 101
Claims 1-7 are directed towards a method and have been reviewed.
Claims 1-7 remain not directed to patent-eligible subject matter, as they recite an abstract idea that is not integrated into a practical application and do not recite additional elements that amount to significantly more than the recited judicial exception.
Claims 8-14 are directed toward an article of manufacture and have been reviewed.
Claims 8-14 remain directed to non-statutory subject matter, as the recited term “machine-storage medium” can be interpreted to include transitory signals.
Further, claims 8-14 remain not directed to patent-eligible subject matter, as they recite an abstract idea that is not integrated into a practical application and do not recite additional elements that amount to significantly more than the recited judicial exception.
Claims 15-21 are directed toward a system and have been reviewed.
Claims 15-21 initially appear to be statutory, as the system includes hardware (at least one hardware processor) as disclosed in ¶ 0067 of the applicant’s specification.
However, claims 15-21 remain not directed to patent-eligible subject matter, as they recite an abstract idea that is not integrated into a practical application and do not recite additional elements that amount to significantly more than the recited judicial exception.
Response to Arguments
35 U.S.C. 103
Applicant’s arguments, see pp. 7-9, filed 10/13/2025, with respect to the rejection(s) of claim(s) 1-7, 8-14, and 15-21 under 35 U.S.C. 103 have been fully considered and are persuasive. Therefore, the rejection has been withdrawn. However, upon further consideration, a new ground(s) of rejection is made under 35 U.S.C. 103 involving newly incorporated reference Finlow-Bates.
35 U.S.C. 101
Applicant's arguments filed 10/13/2025 have been fully considered but they are not persuasive.
Applicant argues (Remarks pp. 9-10) that the claims do not recite an abstract idea because they are directed to solving the technological problem of low-latency data exportation in a network-based data system, offering a technological solution (data exportation in the computer networking environment of large-scale data storage) to a technological problem.
In response to Applicant’s arguments, the Examiner maintains the interpretation that the claims do recite the abstract idea, and the Examiner further believes that the claims are not currently structured to reflect the technological problem and its associated solution.
The hashing techniques present in the claims are considered a mathematical concept at this time and are being considered part of the abstract idea.
The Examiner does see potential merit in the cited portion of the specification (¶ 0051), which describes generated export files being small and downstream applications incurring a loss of performance in handling the small files, and notably a first temporary format that can be easily serialized and deserialized, with result files of a given size later generated in a second final format (“cleaning up the small files and avoiding downstream small file performance issues”). However, the Examiner can more accurately consider these elements with adjustments to the claim language, such as adjustments to the independent claims, or adjustments to the language of dependent claims 5-6 and incorporation into the independent claims.
As a result, the claims remain rejected under 35 U.S.C. 101 as being directed to patent-ineligible subject matter.
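For illustration only, the small-file cleanup discussed above in connection with ¶ 0051 might be sketched as follows. This is a minimal sketch under stated assumptions, not the Applicant’s implementation: the function name merge_small_files, the use of raw bytes for the intermediate files, and the greedy size threshold are all hypothetical and appear neither in the specification nor in the claims.

```python
from typing import Iterable, List


def merge_small_files(chunks: Iterable[bytes], target_size: int) -> List[bytes]:
    """Greedily concatenate small intermediate chunks into larger result files.

    Hypothetical helper: a result file is closed as soon as it holds at
    least target_size bytes; the final result file may be smaller.
    """
    if target_size <= 0:
        raise ValueError("target_size must be positive")
    results: List[bytes] = []
    current = bytearray()
    for chunk in chunks:
        current.extend(chunk)
        if len(current) >= target_size:
            results.append(bytes(current))
            current = bytearray()
    if current:
        # Flush whatever is left over as a final, possibly undersized, file.
        results.append(bytes(current))
    return results


small_files = [b"a" * 10] * 7
result_files = merge_small_files(small_files, target_size=25)
```

With seven 10-byte inputs and a 25-byte target, the sketch emits fewer, larger files while preserving every byte in order, which is the kind of concrete size-driven behavior that is not currently reflected in the independent claims.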
The Applicant also argues (Remarks pp. 10-11) that the claims recite a technological solution of data exportation using parallel processing with hashing at different levels by a plurality of nodes in a network-based data system, necessarily rooting the claims in computer technology.
In response to Applicant’s arguments, the Examiner sees merit in specifying that the first hash is a local hash and the second hash is a sub-divide hash. However, even if the hashing were considered to be an additional element, the Examiner would at this time consider it to be recited at a high level of generality; further detail on the hashing functions is recommended.
As a result, the claims remain rejected under 35 U.S.C. 101 as being directed to patent-ineligible subject matter.
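To make the two hashing levels discussed above concrete, the following is one possible reading of a local hash followed by a sub-divide hash. It is a sketch under assumptions, not the claimed method: the names stable_hash, local_hash, and sub_divide_hash, the row representation, and the choice of SHA-256 modulo the node count for routing are all hypothetical, since the claims do not specify the hash functions.

```python
import hashlib
from collections import defaultdict
from typing import Dict, List

Row = Dict[str, str]


def stable_hash(value: str) -> int:
    """Deterministic hash; Python's built-in hash() is salted per process."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def local_hash(rows: List[Row], partition_by: str) -> Dict[str, List[Row]]:
    """Level 1 (assumed): one node groups its own rows by the partition key."""
    merged: Dict[str, List[Row]] = defaultdict(list)
    for row in rows:
        merged[row[partition_by]].append(row)
    return dict(merged)


def sub_divide_hash(
    per_node: List[Dict[str, List[Row]]], num_nodes: int
) -> List[Dict[str, List[Row]]]:
    """Level 2 (assumed): route each locally merged group to an owner node
    chosen by hash(key) % num_nodes, merging groups that share a key."""
    owners: List[Dict[str, List[Row]]] = [defaultdict(list) for _ in range(num_nodes)]
    for node_groups in per_node:
        for key, rows in node_groups.items():
            owners[stable_hash(key) % num_nodes][key].extend(rows)
    return [dict(owner) for owner in owners]


node_rows = [
    [{"region": "eu", "v": "1"}, {"region": "us", "v": "2"}, {"region": "eu", "v": "3"}],
    [{"region": "us", "v": "4"}, {"region": "eu", "v": "5"}],
]
first_merged = [local_hash(rows, "region") for rows in node_rows]
second_merged = sub_divide_hash(first_merged, num_nodes=2)
```

In this reading, each partition key ends up on exactly one owner node after the second level. Detail of this kind (what is hashed, and how the result routes or merges files) is what is meant by further detail on the hashing functions.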
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
(I)
Claims 1-7 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 2A, Prong One
Independent claim 1 recites hashing a first set of files at different levels by performing a first hash then performing a second hash, which is a mathematical concept.
Step 2A, Prong Two
This judicial exception of hashing a first set of files at different levels by performing a first hash then performing a second hash is not integrated into a practical application despite the following generically recited computer elements, which amount to implementing the abstract idea on a computer, merely using a computer as a tool to perform an abstract idea, or generally linking the use of a judicial exception to a particular technological environment or field of use as seen below.
performing, by a plurality of nodes in the network-based data system, a lower level projection;
performing higher level projection on the second merged files to generate a second set of files,
These additional elements are mere data gathering, which is considered to be insignificant extra-solution activity (MPEP 2106.05(g)).
Step 2B
The claim(s) does/do not include additional elements that are sufficient to amount to significantly more than the judicial exception despite the additional elements shown below:
receiving a command to export data
distributing the first set of files to the plurality of nodes;
These elements receive or transmit data over a network, which are well-understood, routine, conventional computer functions as recognized by the court decisions listed in MPEP § 2106.05(d), specifically MPEP § 2106.05(d)(II)(i).
data from a database stored in a network-based data system
unloading a first set of files to an intermediate storage internal to the network-based data system,
storing the second set of files in the external datastore.
These elements store and retrieve information in memory, which are well-understood, routine, conventional computer functions as recognized by the court decisions listed in MPEP § 2106.05(d).
performing, at each of the plurality of nodes, a local hash on the distributed first set of files using the partition-by parameter setting to generate locally merged files to generally first merged files, and
performing, by the plurality of nodes, a sub-divide hash across the first merged files from the plurality of nodes to generate second merged files;
This is merely a nominal or token extra-solution component of the claim and serves only to generally link the judicial exception to a particular technological environment (see MPEP 2106.05(h)).
Claim 2:
wherein the command includes an export file size for each result file of the export data to be exported to a plurality of partitions in the external data store.
This is merely a nominal or token extra-solution component of the claim and serves only to generally link the judicial exception to a particular technological environment (see MPEP 2106.05(h)).
Claim 3:
querying the database based on the partition-by parameter,
These additional elements are mere data gathering, which is considered to be insignificant extra-solution activity (MPEP 2106.05(g)).
Claim 4:
wherein hashing is performed
Hashing is a mathematical concept and thus comprises an abstract idea.
on external data store,
This is merely a nominal or token extra-solution component of the claim and serves only to generally link the judicial exception to a particular technological environment (see MPEP 2106.05(h)).
Claims 5-7:
These are merely nominal or token extra-solution components of the claims and generally link the judicial exception to a particular technological environment (see MPEP 2106.05(h)).
(II)
Claims 8-14 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 2A, Prong One
Independent claim 8 recites hashing a first set of files at different levels by performing a first hash then performing a second hash, which is a mathematical concept.
Step 2A, Prong Two
This judicial exception of hashing a first set of files at different levels by performing a first hash then performing a second hash is not integrated into a practical application despite the following generically recited computer elements, which amount to implementing the abstract idea on a computer, merely using a computer as a tool to perform an abstract idea, or generally linking the use of a judicial exception to a particular technological environment or field of use as seen below.
a machine
network-based data system
This additional element merely uses a computer as a tool to perform an abstract idea (see MPEP 2106.05(f)).
performing, by a plurality of nodes in the network-based data system, a lower level projection;
performing higher level projection on the second merged files to generate a second set of files,
These additional elements are mere data gathering, which is considered to be insignificant extra-solution activity (MPEP 2106.05(g)).
Step 2B
The claim(s) does/do not include additional elements that are sufficient to amount to significantly more than the judicial exception despite the additional elements shown below:
receiving a command to export data
distributing the first set of files to the plurality of nodes;
These elements receive or transmit data over a network, which are well-understood, routine, conventional computer functions as recognized by the court decisions listed in MPEP § 2106.05(d), specifically MPEP § 2106.05(d)(II)(i).
machine-storage medium embodying instructions
data from a database stored in a network-based data system
unloading a first set of files to an intermediate storage internal to the network-based data system,
storing the second set of files in the external datastore.
These elements store and retrieve information in memory, which are well-understood, routine, conventional computer functions as recognized by the court decisions listed in MPEP § 2106.05(d).
performing, at each of the plurality of nodes, a local hash on the distributed first set of files using the partition-by parameter setting to generate locally merged files to generally first merged files, and
performing, by the plurality of nodes, a sub-divide hash across the first merged files from the plurality of nodes to generate second merged files;
This is merely a nominal or token extra-solution component of the claim and serves only to generally link the judicial exception to a particular technological environment (see MPEP 2106.05(h)).
Claim 9:
wherein the command includes an export file size for each result file of the export data to be exported to a plurality of partitions in the external data store.
This is merely a nominal or token extra-solution component of the claim and serves only to generally link the judicial exception to a particular technological environment (see MPEP 2106.05(h)).
Claim 10:
querying the database based on the partition-by parameter,
These additional elements are mere data gathering, which is considered to be insignificant extra-solution activity (MPEP 2106.05(g)).
Claim 11:
wherein hashing is performed
Hashing is a mathematical concept and thus comprises an abstract idea.
on external data store,
This is merely a nominal or token extra-solution component of the claim and serves only to generally link the judicial exception to a particular technological environment (see MPEP 2106.05(h)).
Claims 12-14:
These are merely nominal or token extra-solution components of the claims and generally link the judicial exception to a particular technological environment (see MPEP 2106.05(h)).
(III)
Claims 15-21 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 2A, Prong One
Independent claim 15 recites hashing a first set of files at different levels by performing a first hash then performing a second hash, which is a mathematical concept.
Step 2A, Prong Two
This judicial exception of hashing a first set of files at different levels by performing a first hash then performing a second hash is not integrated into a practical application despite the following generically recited computer elements, which amount to implementing the abstract idea on a computer, merely using a computer as a tool to perform an abstract idea, or generally linking the use of a judicial exception to a particular technological environment or field of use as seen below.
at least one hardware processor;
network-based data system
This additional element merely uses a computer as a tool to perform an abstract idea (see MPEP 2106.05(f)).
performing, by a plurality of nodes in the network-based data system, a lower level projection;
performing higher level projection on the second merged files to generate a second set of files,
These additional elements are mere data gathering, which is considered to be insignificant extra-solution activity (MPEP 2106.05(g)).
Step 2B
The claim(s) does/do not include additional elements that are sufficient to amount to significantly more than the judicial exception despite the additional elements shown below:
receiving a command to export data
distributing the first set of files to the plurality of nodes;
These elements receive or transmit data over a network, which are well-understood, routine, conventional computer functions as recognized by the court decisions listed in MPEP § 2106.05(d), specifically MPEP § 2106.05(d)(II)(i).
at least one memory storing instructions
data from a database stored in a network-based data system
unloading a first set of files to an intermediate storage internal to the network-based data system,
storing the second set of files in the external datastore.
These elements store and retrieve information in memory, which are well-understood, routine, conventional computer functions as recognized by the court decisions listed in MPEP § 2106.05(d).
performing, at each of the plurality of nodes, a local hash on the distributed first set of files using the partition-by parameter setting to generate locally merged files to generally first merged files, and
performing, by the plurality of nodes, a sub-divide hash across the first merged files from the plurality of nodes to generate second merged files;
This is merely a nominal or token extra-solution component of the claim and serves only to generally link the judicial exception to a particular technological environment (see MPEP 2106.05(h)).
Claim 16:
wherein the command includes an export file size for each result file of the export data to be exported to a plurality of partitions in the external data store.
This is merely a nominal or token extra-solution component of the claim and serves only to generally link the judicial exception to a particular technological environment (see MPEP 2106.05(h)).
Claim 17:
querying the database based on the partition-by parameter,
These additional elements are mere data gathering, which is considered to be insignificant extra-solution activity (MPEP 2106.05(g)).
Claim 18:
wherein hashing is performed
Hashing is a mathematical concept and thus comprises an abstract idea.
on external data store,
This is merely a nominal or token extra-solution component of the claim and serves only to generally link the judicial exception to a particular technological environment (see MPEP 2106.05(h)).
Claims 19-21:
These are merely nominal or token extra-solution components of the claims and generally link the judicial exception to a particular technological environment (see MPEP 2106.05(h)).
(IV)
Claims 8-14 remain rejected under 35 U.S.C. 101 because the claimed invention is directed to non-statutory subject matter. The recited term “machine-storage medium” can be interpreted to include transitory signals.
Official Gazette Notice 1351 OG 212, dated February 23, 2010, states "the broadest reasonable interpretation of a claim drawn to a computer readable medium...typically covers forms of non-transitory tangible media and transitory propagating signals per se in view of the ordinary and customary meaning of computer readable media."
"A transitory, propagating signal ... is not a 'process, machine, manufacture, or composition of matter.' Those four categories define the explicit scope and reach of subject matter patentable under 35 U.S.C. § 101; thus, such a signal cannot be patentable subject matter." In re Nuijten, 84 USPQ2d 1495, 1503 (Fed. Cir. 2007).
Because the full scope of the claim encompasses non-statutory subject matter (i.e., transitory propagating signals), the claim as a whole is non-statutory. The Examiner suggests adding the limitation "non-transitory" to the claims in question to limit the claim scope to encompass only statutory subject matter. Appropriate correction is required.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-2, 8-9, and 15-16 are rejected under 35 U.S.C. 103 as being unpatentable over Maccanti et al., U.S. Patent No. 9,632,878 (published April 25, 2017; hereinafter Maccanti) in view of Mihalcea et al., U.S. Patent Application Publication No. 2016/0321033 (hereinafter Mihalcea) in further view of Finlow-Bates, U.S. Patent Application Publication No. 2016/0134601 (hereinafter Finlow-Bates).
Regarding claim 1, Maccanti teaches:
A method comprising: receiving a command to export data from a database stored in a network-based data system to an external datastore, (Maccanti FIG. 7, col. 26, lines 36-57: As illustrated at 710, in this example, the method may include receiving a request to back up a table ... The method may include beginning to back up each of the partitions in the partition set (independently) and to export them (e.g., to export copies of each of them) to a remote storage system; Maccanti FIG. 9, col. 30, line 46-col. 31, line 7: in response to a request to back up the given table, table data (e.g., table data for one of the partitions of the given table) may be exported and the exported table data may be packaged (as in 920); see also Maccanti col. 9, lines 1-24 regarding this being a network-based data system: storage service clients 310a-310n may encompass any type of client configurable to submit web services requests to Web services platform 330 via network 320)
the command including a partition-by parameter setting for structuring files in the external datastore and a final file type of the files; (Maccanti col. 17, lines 26-60: The control plane APIs provided by the data storage service (and/or the underlying system) may be used to manipulate table-level entities, such as tables and indexes and/or to re-configure various tables (e.g., in response to the findings presented in a skew report or in response to changes in various table or partition configuration parameters specified in a request to perform a restore operation); see also Maccanti col. 32, line 56-col. 33, line 26: the distributed data storage system may support an option to restore a table from a backup with different configuration parameters than those associated with the table from which the backup was created ... a request to restore a table from backup may include a new configuration parameter value for provisioned throughput capacity (e.g., in terms of IOPS for read and/or write operations) or for provisioned storage capacity, or may indicate a change in the indexing for the table (e.g., with modified or additional secondary indexes) ... the automatically triggered repartitioning operations made be performed later; see Maccanti col. 30, line 46-col. 31, line 7 regarding final type: the exported table data (e.g., an exported copy of the partition) may be re-formatted for compliance with an archiving format)
performing, by a plurality of nodes in the network-based data system, a lower level projection; (Maccanti addresses "lower level projection" based on instant specification ¶ 0051; see Maccanti FIG. 9, col. 30, line 46-col. 31, line 7: in response to a request to back up the given table, table data (e.g., table data for one of the partitions of the given table) may be exported and the exported table data may be packaged (as in 920))
unloading a first set of files to an intermediate storage internal to the network-based data system, the first set of files being in an intermediate file type; (Maccanti FIG. 9, col. 30, line 46-col. 31, line 7: table data (e.g., table data for one of the partitions of the given table) may be exported and the exported table data may be packaged (as in 920). For example, the exported table data (e.g., an exported copy of the partition) may be re-formatted for compliance with an archiving format. As illustrated in this example, the packaged table data (e.g., the packaged copy of the partition) may then be compressed (as in 925), with or without encryption, and buffered (as in 930) [shows the unloading to intermediate storage], at least until the correctness of the packaging and/or compression processes have been verified)
hashing the first set of files at different levels using the partition-by parameter setting; (Maccanti shows hashing in FIG. 9, col. 31, lines 8-48: verifying that the packaged and compressed table data is uncorrupted and/or is otherwise usable in restoring the table data may include uncompressing the table data (as in 940), unpackaging the uncompressed table data to return it to its previous format (as in 945), and generating a checksum for the uncompressed, unpackaged table data (as in 950); Maccanti shows hashing using the claimed 'partition-by parameter setting' from earlier through col. 6, line 40-col. 7, line 7: the metadata that is uploaded to the remote storage system as part of a backup operation may also include ... a checksum for each partition (e.g., a checksum generated according to the MD5 message digest algorithm), the BackupID for the backup, and/or any other information that may be usable in a subsequent operation to restore the table and/or to verify the consistency of the restored table)
…
transmitting the export files to the external datastore. (Maccanti FIG. 8, col. 28, line 63-col. 29, line 37: once the exported, packaged, and compressed partition data has been verified (shown as the positive exit from 840), the method may include uploading customer data in the partition to a remote storage system, as in 850, and storing partition-related configuration information in the remote storage system, as in 860; Maccanti FIG. 9, col. 31, lines 23-48: the data storage system may continue the backup operation by uploading the table data to remote storage system 935 (which, in some embodiments, may be a remote key-value storage system). As illustrated in this example, if the verification is successful, an indication to that effect may be sent to the buffering operation/component to enable and/or initiate the uploading operation)
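The verify-before-upload flow quoted from Maccanti FIG. 9 above (package, compress, buffer, then uncompress, unpackage, and checksum) can be illustrated roughly as follows. This is a sketch under assumptions and is not taken from Maccanti: the “ARCH” header standing in for an archiving format, the use of zlib for compression, and the function names are hypothetical; only the use of an MD5 checksum is drawn from the cited passage.

```python
import hashlib
import zlib
from typing import Tuple

ARCHIVE_MAGIC = b"ARCH"  # hypothetical stand-in for an archiving format header


def package(table_data: bytes) -> bytes:
    """Re-format exported table data into the (assumed) archive format."""
    return ARCHIVE_MAGIC + table_data


def unpackage(packaged: bytes) -> bytes:
    """Return packaged data to its previous format."""
    if not packaged.startswith(ARCHIVE_MAGIC):
        raise ValueError("unexpected archive header")
    return packaged[len(ARCHIVE_MAGIC):]


def export_and_verify(table_data: bytes) -> Tuple[bytes, str]:
    """Package and compress, then verify by reversing both steps and
    comparing MD5 checksums before the buffer is released for upload."""
    expected = hashlib.md5(table_data).hexdigest()
    buffered = zlib.compress(package(table_data))
    restored = unpackage(zlib.decompress(buffered))
    if hashlib.md5(restored).hexdigest() != expected:
        raise ValueError("checksum mismatch; buffered data must not be uploaded")
    return buffered, expected


partition = b"id,name\n1,alpha\n2,beta\n"
buffered_copy, partition_checksum = export_and_verify(partition)
```

Note that the checksum in this flow serves to confirm that the buffered copy is uncorrupted; it does not by itself group or route files according to a partition key.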
Maccanti does not expressly disclose:
distributing the first set of files to the plurality of nodes;
the hashing the first set of files at different levels comprising:
performing, at each of the plurality of nodes, a local hash on the distributed first set of files using the partition-by parameter setting to generate locally merged files to generally first merged files, and
performing, by the plurality of nodes, a sub-divide hash across the first merged files from the plurality of nodes to generate second merged files;
performing higher level projection on the second merged files to generate a second set of files, the second set of files being in the final file type;
However, Mihalcea addresses this by teaching:
performing higher level projection on the … files to generate a second set of files, the second set of files being in the final file type; (Mihalcea ¶ 0040-0041: build synchronizer 108 may project a local image file that is encoded in a first file format (e.g., JPEG) to a remote image file that is encoded in a second file format (e.g., PNG). This may be carried out, for example, if JPEG files render successfully in the first context but do not render successfully in the second context, such that a different encoding format may be desired. Such projection may entail automatically creating a remote asset that has a different form and/or a different content than the local asset. Such projection may also entail selectively modifying one or more portions of the local asset to produce the remote asset)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the data exporting and formatting of Maccanti with the data formatting of Mihalcea.
In addition, both of the references (Maccanti and Mihalcea) disclose features that are directed to analogous art, and they are directed to the same field of endeavor, such as data format conversion techniques.
Motivation to do so would be the teaching, suggestion, or motivation for one of ordinary skill in the art to implement consistent software across different platforms/contexts and also to reduce workload as seen in Mihalcea ¶ 0040.
Motivation to do so would also be to improve the functioning of the data reformatting of Maccanti with the similar data formatting of Mihalcea but with the improved ability to ensure data is accessible at a new/second location.
Maccanti in view of Mihalcea does not expressly disclose:
distributing the first set of files to the plurality of nodes;
the hashing the first set of files at different levels comprising:
performing, at each of the plurality of nodes, a local hash on the distributed first set of files using the partition-by parameter setting to generate locally merged files to generally first merged files, and
performing, by the plurality of nodes, a sub-divide hash across the first merged files from the plurality of nodes to generate second merged files;
performing … on the second merged files to generate a second set of files,
However, Finlow-Bates addresses this by teaching:
distributing the first set of files to the plurality of nodes; (Finlow-Bates FIG. 1, ¶ 0033: The system 100 may include various computing devices 110, 120, 130, 140 connected via a network 190; Finlow-Bates FIG. 1, ¶ 0037: computing devices 110, 120, 130, 140 may each participate in a file-sharing service (e.g., execute a common file-sharing application) such that each may query filenames and download various files from one another. The fourth computing device 140 may be configured to store various files 154a-154n available for sharing with the other computing devices 110, 120, 130 via the network 190)
the hashing the first set of files at different levels comprising: performing, at each of the plurality of nodes, a local hash on the distributed first set of files using the partition-by parameter setting to generate locally merged files to generally first merged files, and (Finlow-Bates FIG. 1, ¶ 0038-0039: The first computing device 110 may be configured with the first shuffle algorithm 152 and a second hash function 180 (e.g., “Hash_2” in FIG. 1) [shows local hash] ... The second computing device 120 may be configured with a second shuffle algorithm 182 (e.g., “Shuffle Alg_2” in FIG. 1) and the shared first hash function 150 ... The third computing device 130 may be configured with both the shared first hash function 150 [also shows local hash] and the first shuffle algorithm 152; see also Finlow-Bates discussing the shared hash function in ¶ 0035: a file may be un-shuffled by a recipient computing device upon receipt from a sender device using a shared hash function applied to a filename for the file and a shuffling algorithm shared by the sender and recipient computing devices [relevant to the partition-by parameter setting])
performing, by the plurality of nodes, a sub-divide hash across the first merged files from the plurality of nodes to generate second merged files; (Finlow-Bates ¶ 0044-0046: In a first iteration 210 of such a shuffling algorithm, the computing device may divide the original file 202 into two equal-sized subdivisions or segments 212, 213 (e.g., two 512 KB subdivisions) [shows sub-divide hash] and may evaluate the first bit of the hash output 204 ... In a second iteration 220 of the shuffling algorithm, the computing device may subdivide (e.g., halve) each of the two segments 212, 213 to make four segments 222, 223, 224, 225 (e.g., four 256 KB subdivisions), and may evaluate the second bit of the hash output 204 (i.e., index ‘1’ of the hash output 204) with the third and fourth segments 222, 223 and the third bit of the hash output 204 (i.e., index ‘2’ of the hash output 204) with the fifth and sixth segments 224, 225; see Finlow-Bates ¶ 0059: The shuffle algorithm may use various techniques for programmatically shuffling, un-sorting, scrambling, and/or otherwise reordering data segments [shuffling shows relevance to the claimed hashing])
performing … on the second merged files to generate a second set of files, (Finlow-Bates FIG. 3B, ¶ 0066-0068: In block 360, the processor of the computing device may un-shuffle the plurality of data segments using the shared, looping reverse-shuffle algorithm (i.e., executing a shuffle algorithm in a reverse manner) … In optional block 366, the processor of the computing device may store the un-shuffled digital file with the public filename, such as by saving the un-shuffled (or original) file to an internal storage device, a remote database, a connected storage drive or device (e.g., external hard drive), etc.)
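The iterative subdivision described in the cited Finlow-Bates passages (¶ 0044-0046 and ¶ 0066-0068) can be approximated by the sketch below. Several points are assumptions rather than teachings of the reference: the cited text says only that a hash bit is “evaluated” with each pair of segments, so swapping a pair when its bit is 1 is a guess; SHA-256 stands in for the unspecified shared hash function; and the function names are hypothetical.

```python
import hashlib
from typing import List, Tuple


def hash_bits(filename: str) -> List[int]:
    """Bits of a hash of the filename, most significant bit first (SHA-256 assumed)."""
    digest = hashlib.sha256(filename.encode("utf-8")).digest()
    return [(byte >> (7 - i)) & 1 for byte in digest for i in range(8)]


def _swap_pass(data: bytes, parts: int, bits: List[int]) -> bytes:
    """Split data into `parts` equal segments; swap each adjacent pair whose bit is 1."""
    size = len(data) // parts
    segs = [data[i * size:(i + 1) * size] for i in range(parts)]
    for pair, bit in enumerate(bits):
        if bit:  # assumed meaning of "evaluate": a set bit swaps the pair
            segs[2 * pair], segs[2 * pair + 1] = segs[2 * pair + 1], segs[2 * pair]
    return b"".join(segs)


def shuffle(data: bytes, filename: str, iterations: int = 2, reverse: bool = False) -> bytes:
    """Iteration i halves the segments again (2**i parts) and consumes the
    next 2**(i-1) hash bits; reverse=True replays the passes backwards."""
    if not 1 <= iterations <= 8:
        raise ValueError("iterations must be between 1 and 8 for a 256-bit hash")
    if len(data) % (2 ** iterations):
        raise ValueError("data length must be divisible by 2**iterations")
    bits = hash_bits(filename)
    passes: List[Tuple[int, List[int]]] = []
    offset = 0
    for it in range(1, iterations + 1):
        pairs = 2 ** (it - 1)
        passes.append((2 ** it, bits[offset:offset + pairs]))
        offset += pairs
    for parts, pass_bits in (reversed(passes) if reverse else passes):
        data = _swap_pass(data, parts, pass_bits)
    return data


original = bytes(range(16))
shuffled = shuffle(original, "report.csv")
restored = shuffle(shuffled, "report.csv", reverse=True)
```

Because each pass only swaps adjacent segments, replaying the passes in reverse order restores the original file, which corresponds to the un-shuffling of block 360.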
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the data exporting and formatting of Maccanti as modified with the data sharing and hashing and shuffling of Finlow-Bates.
In addition, both of the references (Maccanti as modified and Finlow-Bates) disclose features that are directed to analogous art, and they are directed to the same field of endeavor, such as data conversion techniques.
Motivation to do so would be the teaching, suggestion, or motivation for one of ordinary skill in the art to implement improved security when sharing or locally storing files as seen in Finlow-Bates ¶ 0023.
Motivation to do so would also be to improve the functioning of the data conversion of Maccanti as modified with the similar data conversion of Finlow-Bates but with the improved ability to utilize shared hashing and shuffling algorithms.
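For illustration only, the iterative subdividing shuffle quoted above from Finlow-Bates ¶ 0044-0046 can be sketched as follows. The quoted passage states only that one bit of the hash output is evaluated per pair of sibling segments (index 0 for the first pair, indices 1 and 2 for the two pairs in the second iteration). The rule that a bit value of 1 swaps the pair, the choice of SHA-256 over the filename, and all function names below are assumptions made for this sketch and are not taken from the reference.

```python
import hashlib


def hash_bits(filename: str) -> list[int]:
    """Return the bits of a hash of the filename (SHA-256 is an assumption)."""
    digest = hashlib.sha256(filename.encode("utf-8")).digest()
    return [(byte >> (7 - i)) & 1 for byte in digest for i in range(8)]


def swap_level(data: bytes, level: int, bits: list[int]) -> bytes:
    """Split each of the 2**level blocks in half; swap the halves when the
    hash bit assigned to that block is 1 (the swap rule is an assumption)."""
    count = 2 ** level
    size = len(data) // count
    half = size // 2
    first_bit = count - 1  # level 0 uses bit 0; level 1 uses bits 1 and 2
    out = bytearray()
    for j in range(count):
        block = data[j * size:(j + 1) * size]
        left, right = block[:half], block[half:]
        if bits[first_bit + j]:
            left, right = right, left
        out += left + right
    return bytes(out)


def shuffle(data: bytes, bits: list[int], levels: int) -> bytes:
    """Apply the subdivide-and-swap step for levels 0 .. levels-1."""
    if len(data) % (2 ** levels) != 0:
        raise ValueError("data length must be divisible by 2**levels")
    for level in range(levels):
        data = swap_level(data, level, bits)
    return data


def unshuffle(data: bytes, bits: list[int], levels: int) -> bytes:
    """Undo the shuffle by applying the same steps in reverse order."""
    if len(data) % (2 ** levels) != 0:
        raise ValueError("data length must be divisible by 2**levels")
    for level in reversed(range(levels)):
        data = swap_level(data, level, bits)
    return data
```

Because each level only swaps equal-sized halves, every level is its own inverse, so running the levels in reverse order restores the original bytes. This corresponds to the reverse-shuffle described in Finlow-Bates ¶ 0066-0068 only at the level of this sketch.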
Regarding claim 8, Maccanti teaches:
A machine-storage medium embodying instructions that, when executed by a machine, cause the machine to perform operations comprising: (Maccanti col. 51, line 61-col. 52, line 21: Any or all of program instructions 1925 may be provided as a computer program product, or software, that may include a non-transitory computer-readable storage medium having stored thereon instructions, which may be used to program a computer system (or other electronic devices) to perform a process according to various embodiments)
receiving a command to export data from a database stored in a network-based data system to an external datastore, (Maccanti FIG. 7, col. 26, lines 36-57: As illustrated at 710, in this example, the method may include receiving a request to back up a table ... The method may include beginning to back up each of the partitions in the partition set (independently) and to export them (e.g., to export copies of each of them) to a remote storage system; Maccanti FIG. 9, col. 30, line 46-col. 31, line 7: in response to a request to back up the given table, table data (e.g., table data for one of the partitions of the given table) may be exported and the exported table data may be packaged (as in 920); see also Maccanti col. 9, lines 1-24 regarding this being a network-based data system: storage service clients 310a-310n may encompass any type of client configurable to submit web services requests to Web services platform 330 via network 320)
the command including a partition-by parameter setting for structuring files in the external datastore and a final file type of the files; (Maccanti col. 17, lines 26-60: The control plane APIs provided by the data storage service (and/or the underlying system) may be used to manipulate table-level entities, such as tables and indexes and/or to re-configure various tables (e.g., in response to the findings presented in a skew report or in response to changes in various table or partition configuration parameters specified in a request to perform a restore operation); see also Maccanti col. 32, line 56-col. 33, line 26: the distributed data storage system may support an option to restore a table from a backup with different configuration parameters than those associated with the table from which the backup was created ... a request to restore a table from backup may include a new configuration parameter value for provisioned throughput capacity (e.g., in terms of IOPS for read and/or write operations) or for provisioned storage capacity, or may indicate a change in the indexing for the table (e.g., with modified or additional secondary indexes) ... the automatically triggered repartitioning operations made be performed later; see Maccanti col. 30, line 46-col. 31, line 7 regarding final type: the exported table data (e.g., an exported copy of the partition) may be re-formatted for compliance with an archiving format)
performing, by a plurality of nodes in the network-based data system, a lower level projection; (Maccanti addresses "lower level projection" based on instant specification ¶ 0051; see Maccanti FIG. 9, col. 30, line 46-col. 31, line 7: in response to a request to back up the given table, table data (e.g., table data for one of the partitions of the given table) may be exported and the exported table data may be packaged (as in 920))
unloading a first set of files to an intermediate storage internal to the network-based data system, the first set of files being in an intermediate file type; (Maccanti FIG. 9, col. 30, line 46-col. 31, line 7: table data (e.g., table data for one of the partitions of the given table) may be exported and the exported table data may be packaged (as in 920). For example, the exported table data (e.g., an exported copy of the partition) may be re-formatted for compliance with an archiving format. As illustrated in this example, the packaged table data (e.g., the packaged copy of the partition) may then be compressed (as in 925), with or without encryption, and buffered (as in 930) [shows the unloading to intermediate storage], at least until the correctness of the packaging and/or compression processes have been verified)
hashing the first set of files at different levels using the partition-by parameter setting; (Maccanti shows hashing in FIG. 9, col. 31, lines 8-48: verifying that the packaged and compressed table data is uncorrupted and/or is otherwise usable in restoring the table data may include uncompressing the table data (as in 940), unpackaging the uncompressed table data to return it to its previous format (as in 945), and generating a checksum for the uncompressed, unpackaged table data (as in 950); Maccanti shows hashing using the claimed 'partition-by parameter setting' from earlier through col. 6, line 40-col. 7, line 7: the metadata that is uploaded to the remote storage system as part of a backup operation may also include ... a checksum for each partition (e.g., a checksum generated according to the MD5 message digest algorithm), the BackupID for the backup, and/or any other information that may be usable in a subsequent operation to restore the table and/or to verify the consistency of the restored table)
…
transmitting the export files to the external datastore. (Maccanti FIG. 8, col. 28, line 63-col. 29, line 37: once the exported, packaged, and compressed partition data has been verified (shown as the positive exit from 840), the method may include uploading customer data in the partition to a remote storage system, as in 850, and storing partition-related configuration information in the remote storage system, as in 860; Maccanti FIG. 9, col. 31, lines 23-48: the data storage system may continue the backup operation by uploading the table data to remote storage system 935 (which, in some embodiments, may be a remote key-value storage system). As illustrated in this example, if the verification is successful, an indication to that effect may be sent to the buffering operation/component to enable and/or initiate the uploading operation)
Maccanti does not expressly disclose:
distributing the first set of files to the plurality of nodes;
the hashing the first set of files at different levels comprising:
performing, at each of the plurality of nodes, a local hash on the distributed first set of files using the partition-by parameter setting to generate locally merged files to generally first merged files, and
performing, by the plurality of nodes, a sub-divide hash across the first merged files from the plurality of nodes to generate second merged files;
performing higher level projection on the second merged files to generate a second set of files, the second set of files being in the final file type;
However, Mihalcea addresses this by teaching:
performing higher level projection on the … files to generate a second set of files, the second set of files being in the final file type; (Mihalcea ¶ 0040-0041: build synchronizer 108 may project a local image file that is encoded in a first file format (e.g., JPEG) to a remote image file that is encoded in a second file format (e.g., PNG). This may be carried out, for example, if JPEG files render successfully in the first context but do not render successfully in the second context, such that a different encoding format may be desired. Such projection may entail automatically creating a remote asset that has a different form and/or a different content than the local asset. Such projection may also entail selectively modifying one or more portions of the local asset to produce the remote asset)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the data exporting and formatting of Maccanti with the data formatting of Mihalcea.
In addition, both of the references (Maccanti and Mihalcea) disclose features that are directed to analogous art, and they are directed to the same field of endeavor, such as data format conversion techniques.
Motivation to do so would be the teaching, suggestion, or motivation for one of ordinary skill in the art to implement consistent software across different platforms/contexts and also to reduce workload as seen in Mihalcea ¶ 0040.
Motivation to do so would also be to improve the functioning of the data reformatting of Maccanti with the similar data formatting of Mihalcea but with the improved ability to ensure data is accessible at a new/second location.
Maccanti in view of Mihalcea does not expressly disclose:
distributing the first set of files to the plurality of nodes;
the hashing the first set of files at different levels comprising:
performing, at each of the plurality of nodes, a local hash on the distributed first set of files using the partition-by parameter setting to generate locally merged files to generally first merged files, and
performing, by the plurality of nodes, a sub-divide hash across the first merged files from the plurality of nodes to generate second merged files;
performing … on the second merged files to generate a second set of files,
However, Finlow-Bates addresses this by teaching:
distributing the first set of files to the plurality of nodes; (Finlow-Bates FIG. 1, ¶ 0033: The system 100 may include various computing devices 110, 120, 130, 140 connected via a network 190; Finlow-Bates FIG. 1, ¶ 0037: computing devices 110, 120, 130, 140 may each participate in a file-sharing service (e.g., execute a common file-sharing application) such that each may query filenames and download various files from one another. The fourth computing device 140 may be configured to store various files 154a-154n available for sharing with the other computing devices 110, 120, 130 via the network 190)
the hashing the first set of files at different levels comprising: performing, at each of the plurality of nodes, a local hash on the distributed first set of files using the partition-by parameter setting to generate locally merged files to generally first merged files, and (Finlow-Bates FIG. 1, ¶ 0038-0039: The first computing device 110 may be configured with the first shuffle algorithm 152 and a second hash function 180 (e.g., “Hash_2” in FIG. 1) [shows local hash] ... The second computing device 120 may be configured with a second shuffle algorithm 182 (e.g., “Shuffle Alg_2” in FIG. 1) and the shared first hash function 150 ... The third computing device 130 may be configured with both the shared first hash function 150 [also shows local hash] and the first shuffle algorithm 152; see also Finlow-Bates discussing the shared hash function in ¶ 0035: a file may be un-shuffled by a recipient computing device upon receipt from a sender device using a shared hash function applied to a filename for the file and a shuffling algorithm shared by the sender and recipient computing devices [relevant to the partition-by parameter setting])
performing, by the plurality of nodes, a sub-divide hash across the first merged files from the plurality of nodes to generate second merged files; (Finlow-Bates ¶ 0044-0046: In a first iteration 210 of such a shuffling algorithm, the computing device may divide the original file 202 into two equal-sized subdivisions or segments 212, 213 (e.g., two 512 KB subdivisions) [shows sub-divide hash] and may evaluate the first bit of the hash output 204 ... In a second iteration 220 of the shuffling algorithm, the computing device may subdivide (e.g., halve) each of the two segments 212, 213 to make four segments 222, 223, 224, 225 (e.g., four 256 KB subdivisions), and may evaluate the second bit of the hash output 204 (i.e., index ‘1’ of the hash output 204) with the third and fourth segments 222, 223 and the third bit of the hash output 204 (i.e., index ‘2’ of the hash output 204) with the fifth and sixth segments 224, 225; see Finlow-Bates ¶ 0059: The shuffle algorithm may use various techniques for programmatically shuffling, un-sorting, scrambling, and/or otherwise reordering data segments [shuffling shows relevance to the claimed hashing])
performing … on the second merged files to generate a second set of files, (Finlow-Bates FIG. 3B, ¶ 0066-0068: In block 360, the processor of the computing device may un-shuffle the plurality of data segments using the shared, looping reverse-shuffle algorithm (i.e., executing a shuffle algorithm in a reverse manner) … In optional block 366, the processor of the computing device may store the un-shuffled digital file with the public filename, such as by saving the un-shuffled (or original) file to an internal storage device, a remote database, a connected storage drive or device (e.g., external hard drive), etc.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the data exporting and formatting of Maccanti as modified with the data sharing and hashing and shuffling of Finlow-Bates.
In addition, both of the references (Maccanti as modified and Finlow-Bates) disclose features that are directed to analogous art, and they are directed to the same field of endeavor, such as data conversion techniques.
Motivation to do so would be the teaching, suggestion, or motivation for one of ordinary skill in the art to implement improved security when sharing or locally storing files as seen in Finlow-Bates ¶ 0023.
Motivation to do so would also be to improve the functioning of the data conversion of Maccanti as modified with the similar data conversion of Finlow-Bates but with the improved ability to utilize shared hashing and shuffling algorithms.
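For illustration only, the package, compress, buffer, verify, and upload sequence quoted above from Maccanti FIG. 9 can be sketched as follows. JSON stands in for the unspecified archiving format, zlib for the unspecified compression, and the checksum is compared against one computed on the exported data before packaging. These choices, and every function name below, are assumptions made for this sketch. Only the use of an MD5 checksum is taken from the quoted passage.

```python
import hashlib
import json
import zlib


def package(rows: list[dict]) -> bytes:
    """Re-format exported rows into a stand-in archiving format (JSON)."""
    return json.dumps(rows, sort_keys=True).encode("utf-8")


def unpackage(blob: bytes) -> list[dict]:
    """Return packaged data to its previous (row) format."""
    return json.loads(blob.decode("utf-8"))


def checksum(rows: list[dict]) -> str:
    """MD5 checksum over a canonical serialization of the rows."""
    return hashlib.md5(package(rows)).hexdigest()


def verify_and_upload(rows: list[dict], upload) -> bool:
    """Package and compress the rows, verify the round trip, then upload.

    `upload` is a hypothetical callable taking (compressed_bytes, checksum).
    """
    expected = checksum(rows)
    buffered = zlib.compress(package(rows))          # held until verified
    restored = unpackage(zlib.decompress(buffered))  # uncompress, unpackage
    if checksum(restored) != expected:
        return False                                 # do not upload bad data
    upload(buffered, expected)
    return True
```

The upload happens only on the positive exit from verification, which mirrors the quoted statement that a successful verification enables the uploading operation.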
Regarding claim 15, Maccanti teaches:
A system comprising: at least one hardware processor; and at least one memory storing instructions that, when executed by the one or more processors, cause the at least one hardware processor to perform operations comprising: (Maccanti col. 52, lines 22-34: program instructions and/or data as described herein for implementing a data storage service that employs the techniques described above may be received, sent or stored upon different types of computer-readable media or on similar media separate from system memory 1920 or computing node 1900. Program instructions and data stored on a computer-readable storage medium may be transmitted to a computing node 1900 for execution by a processor 1910)
receiving a command to export data from a database stored in a network-based data system to an external datastore, (Maccanti FIG. 7, col. 26, lines 36-57: As illustrated at 710, in this example, the method may include receiving a request to back up a table ... The method may include beginning to back up each of the partitions in the partition set (independently) and to export them (e.g., to export copies of each of them) to a remote storage system; Maccanti FIG. 9, col. 30, line 46-col. 31, line 7: in response to a request to back up the given table, table data (e.g., table data for one of the partitions of the given table) may be exported and the exported table data may be packaged (as in 920); see also Maccanti col. 9, lines 1-24 regarding this being a network-based data system: storage service clients 310a-310n may encompass any type of client configurable to submit web services requests to Web services platform 330 via network 320)
the command including a partition-by parameter setting for structuring files in the external datastore and a final file type of the files; (Maccanti col. 17, lines 26-60: The control plane APIs provided by the data storage service (and/or the underlying system) may be used to manipulate table-level entities, such as tables and indexes and/or to re-configure various tables (e.g., in response to the findings presented in a skew report or in response to changes in various table or partition configuration parameters specified in a request to perform a restore operation); see also Maccanti col. 32, line 56-col. 33, line 26: the distributed data storage system may support an option to restore a table from a backup with different configuration parameters than those associated with the table from which the backup was created ... a request to restore a table from backup may include a new configuration parameter value for provisioned throughput capacity (e.g., in terms of IOPS for read and/or write operations) or for provisioned storage capacity, or may indicate a change in the indexing for the table (e.g., with modified or additional secondary indexes) ... the automatically triggered repartitioning operations made be performed later; see Maccanti col. 30, line 46-col. 31, line 7 regarding final type: the exported table data (e.g., an exported copy of the partition) may be re-formatted for compliance with an archiving format)
performing, by a plurality of nodes in the network-based data system, a lower level projection; (Maccanti addresses "lower level projection" based on instant specification ¶ 0051; see Maccanti FIG. 9, col. 30, line 46-col. 31, line 7: in response to a request to back up the given table, table data (e.g., table data for one of the partitions of the given table) may be exported and the exported table data may be packaged (as in 920))
unloading a first set of files to an intermediate storage internal to the network-based data system, the first set of files being in an intermediate file type; (Maccanti FIG. 9, col. 30, line 46-col. 31, line 7: table data (e.g., table data for one of the partitions of the given table) may be exported and the exported table data may be packaged (as in 920). For example, the exported table data (e.g., an exported copy of the partition) may be re-formatted for compliance with an archiving format. As illustrated in this example, the packaged table data (e.g., the packaged copy of the partition) may then be compressed (as in 925), with or without encryption, and buffered (as in 930) [shows the unloading to intermediate storage], at least until the correctness of the packaging and/or compression processes have been verified)
hashing the first set of files at different levels using the partition-by parameter setting; (Maccanti shows hashing in FIG. 9, col. 31, lines 8-48: verifying that the packaged and compressed table data is uncorrupted and/or is otherwise usable in restoring the table data may include uncompressing the table data (as in 940), unpackaging the uncompressed table data to return it to its previous format (as in 945), and generating a checksum for the uncompressed, unpackaged table data (as in 950); Maccanti shows hashing using the claimed 'partition-by parameter setting' from earlier through col. 6, line 40-col. 7, line 7: the metadata that is uploaded to the remote storage system as part of a backup operation may also include ... a checksum for each partition (e.g., a checksum generated according to the MD5 message digest algorithm), the BackupID for the backup, and/or any other information that may be usable in a subsequent operation to restore the table and/or to verify the consistency of the restored table)
…
transmitting the export files to the external datastore. (Maccanti FIG. 8, col. 28, line 63-col. 29, line 37: once the exported, packaged, and compressed partition data has been verified (shown as the positive exit from 840), the method may include uploading customer data in the partition to a remote storage system, as in 850, and storing partition-related configuration information in the remote storage system, as in 860; Maccanti FIG. 9, col. 31, lines 23-48: the data storage system may continue the backup operation by uploading the table data to remote storage system 935 (which, in some embodiments, may be a remote key-value storage system). As illustrated in this example, if the verification is successful, an indication to that effect may be sent to the buffering operation/component to enable and/or initiate the uploading operation)
Maccanti does not expressly disclose:
distributing the first set of files to the plurality of nodes;
the hashing the first set of files at different levels comprising:
performing, at each of the plurality of nodes, a local hash on the distributed first set of files using the partition-by parameter setting to generate locally merged files to generally first merged files, and
performing, by the plurality of nodes, a sub-divide hash across the first merged files from the plurality of nodes to generate second merged files;
performing higher level projection on the second merged files to generate a second set of files, the second set of files being in the final file type;
However, Mihalcea addresses this by teaching:
performing higher level projection on the … files to generate a second set of files, the second set of files being in the final file type; (Mihalcea ¶ 0040-0041: build synchronizer 108 may project a local image file that is encoded in a first file format (e.g., JPEG) to a remote image file that is encoded in a second file format (e.g., PNG). This may be carried out, for example, if JPEG files render successfully in the first context but do not render successfully in the second context, such that a different encoding format may be desired. Such projection may entail automatically creating a remote asset that has a different form and/or a different content than the local asset. Such projection may also entail selectively modifying one or more portions of the local asset to produce the remote asset)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the data exporting and formatting of Maccanti with the data formatting of Mihalcea.
In addition, both of the references (Maccanti and Mihalcea) disclose features that are directed to analogous art, and they are directed to the same field of endeavor, such as data format conversion techniques.
Motivation to do so would be the teaching, suggestion, or motivation for one of ordinary skill in the art to implement consistent software across different platforms/contexts and also to reduce workload as seen in Mihalcea ¶ 0040.
Motivation to do so would also be to improve the functioning of the data reformatting of Maccanti with the similar data formatting of Mihalcea but with the improved ability to ensure data is accessible at a new/second location.
Maccanti in view of Mihalcea does not expressly disclose:
distributing the first set of files to the plurality of nodes;
the hashing the first set of files at different levels comprising:
performing, at each of the plurality of nodes, a local hash on the distributed first set of files using the partition-by parameter setting to generate locally merged files to generally first merged files, and
performing, by the plurality of nodes, a sub-divide hash across the first merged files from the plurality of nodes to generate second merged files;
performing … on the second merged files to generate a second set of files,
However, Finlow-Bates addresses this by teaching:
distributing the first set of files to the plurality of nodes; (Finlow-Bates FIG. 1, ¶ 0033: The system 100 may include various computing devices 110, 120, 130, 140 connected via a network 190; Finlow-Bates FIG. 1, ¶ 0037: computing devices 110, 120, 130, 140 may each participate in a file-sharing service (e.g., execute a common file-sharing application) such that each may query filenames and download various files from one another. The fourth computing device 140 may be configured to store various files 154a-154n available for sharing with the other computing devices 110, 120, 130 via the network 190)
the hashing the first set of files at different levels comprising: performing, at each of the plurality of nodes, a local hash on the distributed first set of files using the partition-by parameter setting to generate locally merged files to generally first merged files, and (Finlow-Bates FIG. 1, ¶ 0038-0039: The first computing device 110 may be configured with the first shuffle algorithm 152 and a second hash function 180 (e.g., “Hash_2” in FIG. 1) [shows local hash] ... The second computing device 120 may be configured with a second shuffle algorithm 182 (e.g., “Shuffle Alg_2” in FIG. 1) and the shared first hash function 150 ... The third computing device 130 may be configured with both the shared first hash function 150 [also shows local hash] and the first shuffle algorithm 152; see also Finlow-Bates discussing the shared hash function in ¶ 0035: a file may be un-shuffled by a recipient computing device upon receipt from a sender device using a shared hash function applied to a filename for the file and a shuffling algorithm shared by the sender and recipient computing devices [relevant to the partition-by parameter setting])
performing, by the plurality of nodes, a sub-divide hash across the first merged files from the plurality of nodes to generate second merged files; (Finlow-Bates ¶ 0044-0046: In a first iteration 210 of such a shuffling algorithm, the computing device may divide the original file 202 into two equal-sized subdivisions or segments 212, 213 (e.g., two 512 KB subdivisions) [shows sub-divide hash] and may evaluate the first bit of the hash output 204 ... In a second iteration 220 of the shuffling algorithm, the computing device may subdivide (e.g., halve) each of the two segments 212, 213 to make four segments 222, 223, 224, 225 (e.g., four 256 KB subdivisions), and may evaluate the second bit of the hash output 204 (i.e., index ‘1’ of the hash output 204) with the third and fourth segments 222, 223 and the third bit of the hash output 204 (i.e., index ‘2’ of the hash output 204) with the fifth and sixth segments 224, 225; see Finlow-Bates ¶ 0059: The shuffle algorithm may use various techniques for programmatically shuffling, un-sorting, scrambling, and/or otherwise reordering data segments [shuffling shows relevance to the claimed hashing])
performing … on the second merged files to generate a second set of files, (Finlow-Bates FIG. 3B, ¶ 0066-0068: In block 360, the processor of the computing device may un-shuffle the plurality of data segments using the shared, looping reverse-shuffle algorithm (i.e., executing a shuffle algorithm in a reverse manner) … In optional block 366, the processor of the computing device may store the un-shuffled digital file with the public filename, such as by saving the un-shuffled (or original) file to an internal storage device, a remote database, a connected storage drive or device (e.g., external hard drive), etc.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the data exporting and formatting of Maccanti as modified with the data sharing and hashing and shuffling of Finlow-Bates.
In addition, both of the references (Maccanti as modified and Finlow-Bates) disclose features that are directed to analogous art, and they are directed to the same field of endeavor, such as data conversion techniques.
Motivation to do so would be the teaching, suggestion, or motivation for one of ordinary skill in the art to implement improved security when sharing or locally storing files as seen in Finlow-Bates ¶ 0023.
Motivation to do so would also be to improve the functioning of the data conversion of Maccanti as modified with the similar data conversion of Finlow-Bates but with the improved ability to utilize shared hashing and shuffling algorithms.
Regarding claims 2, 9, and 16, Maccanti in view of Mihalcea and Finlow-Bates teaches:
wherein the command includes an export file size for each result of the export data to be exported to a plurality of partitions in the external data store. (Maccanti describes export file size for exporting to partitions in col. 12, lines 26-45: an increase in the number of partitions may result in a larger usable table size and/or increased throughput capacity for service requests. As described herein, in some embodiments, live repartitioning (whether programmatic/automatic or explicitly initiated) may be employed to adapt to workload changes; Maccanti describes export file sizes in col. 28, lines 18-33: the backup-related metadata that may be created, stored, used, and/or updated by various backup and restore operations may include any or all of the following: Information about the backup (e.g., time at which the backup was requested, the time at which it was completed, the size of the backup and/or the items in the backup, etc.) ... Information about each partition (e.g., the format(s) in which they were stored, the size of the file uploaded into the remote storage system [also shows export file size for exporting in an external data store], an MD5 checksum or another type of checksum for the partition, the hash key start value for the partition, the hash key end value for the partition, etc.))
Claims 3, 10, and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Maccanti in view of Mihalcea and Finlow-Bates in further view of Chowdhuri et al., U.S. Patent Application Publication No. 2006/0218123 (provided in IDS 01/10/2025 and utilized in the parent application(s); hereinafter Chowdhuri).
Regarding claims 3, 10, and 17, Maccanti in view of Mihalcea and Finlow-Bates teaches all the features with respect to claims 2, 9, and 16 respectively including:
wherein the lower level projection includes generating the first set of files by querying the database based on the partition-by parameter, the first set of files being smaller than the export file size. (Maccanti shows a first set of files being smaller than an export file size in col. 4, lines 15-39: the service may support automatic live repartitioning of data in response to the detection of various anomalies (e.g., failure or fault conditions, hot spots, or increases in table size and/or service request throughput), and/or explicit (e.g., pro-active and/or subscriber-initiated) live repartitioning of data to support planned or anticipated table size and/or throughput increases ... the service may in some embodiments initiate the re-sizing (scaling) and/or repartitioning of a table programmatically in response to receiving one or more requests to store, retrieve, modify, or delete items in the scalable table; see also Maccanti showing a first set of files based on a partition-by parameter in col. 32, line 56-col. 33, line 26: if the new configuration parameter value indicates an increase or decrease in storage capacity or throughput capacity, its application may automatically trigger a partition split, move, or merge operation, in some embodiments ... In other embodiments, partitions may be restored using their original configuration parameter value (or using default configuration parameter values), the new table may be made available using those configuration parameter values, and the automatically triggered repartitioning operations made be performed later (e.g., using a background process) )
Maccanti in view of Mihalcea and Finlow-Bates does not expressly disclose generating the first set of files by querying the database.
However, Chowdhuri addresses this by teaching generating the first set of files by querying the database. (Chowdhuri ¶ 0018: receiving a query specifying a join of two or more database tables; as data is retrieved from the database during processing of the query, partitioning the data into separate memory buffers; Chowdhuri ¶ 0128: The joined rows could now be further repartitioned (e.g., in memory buffers) on the attribute state and sent to different CPU threads to evaluate the grouped aggregate operations in parallel. Vertical parallelism assists in the execution of the query by allowing intermediate results to be pipelined to the next operator)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the data table management of Maccanti as modified with the database table management of Chowdhuri.
In addition, both of the references (Maccanti as modified and Chowdhuri) disclose features that are directed to analogous art, and they are directed to the same field of endeavor, such as table management.
Motivation to do so would be the teaching, suggestion, or motivation for one of ordinary skill in the art to improve query processing performance as seen in Chowdhuri ¶ 0120.
Motivation to do so would also be to improve the functioning of the repartitioning of Maccanti as modified with the similar repartitioning of Chowdhuri but with the improved emphasis on parallel execution.
Claims 4, 11, and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Maccanti in view of Mihalcea and Finlow-Bates in further view of Chowdhuri in further view of Prahlad et al., U.S. Patent Application Publication No. 2010/0332454 (hereinafter Prahlad).
Regarding claims 4, 11, and 18, Maccanti in view of Mihalcea and Finlow-Bates and Chowdhuri teaches all the features with respect to claims 3, 10, and 17 respectively but does not expressly disclose:
wherein hashing is performed on [the] external data store,
wherein the second set of files being larger than the first set of files.
However, Prahlad addresses this by teaching:
wherein hashing is performed on [the] external data store, (Prahlad ¶ 0144: The file may then be transferred to the cloud storage site. The cloud storage site in turn similarly creates a hash value and sends this second hash value back. The client may then compare the two hash values to verify that the cloud storage site properly received the file for storage)
wherein the second set of files being larger than the first set of files. (Prahlad ¶ 0232-0234: By containerizing the objects or blocks, the system reduces the strain on the file system namespace of the secondary cloud storage site 115, since it reduces the number of files stored on the file system of the cloud storage site 115 ... Thus, by using larger container files, the system may reduce namespace strain on the secondary cloud storage site 115; see then Prahlad ¶ 0234: If the cloud storage site 115A-N bases its charges on the number of files or directories used on the site, larger container files may be desirable ... the system may impose an absolute lower limit on the size of container files used, since there may be overhead costs (e.g., processing time and/or memory used) for each additional container file used in a storage operation)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the checksum verification of Maccanti as modified with the hash comparisons of Prahlad.
In addition, both of the references (Maccanti as modified and Prahlad) disclose features that are directed to analogous art, and they are directed to the same field of endeavor, such as data verification techniques.
Motivation to do so would be the teaching, suggestion, or motivation for one of ordinary skill in the art to offer users lower operating costs and ensure disaster recovery while improving long-term compliance management as seen in Prahlad ¶ 0422.
Motivation to do so would also be to improve the functioning of the checksum verification of Maccanti as modified with the similar hash comparisons of Prahlad but with the added assurance that data was properly received at the destination.
Claims 5, 12, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Maccanti in view of Mihalcea and Finlow-Bates in further view of Florendo et al., U.S. Patent Application Publication No. 2016/0147738 (provided in IDS 01/10/2025 and utilized in the parent application(s); hereinafter Florendo).
Regarding claims 5, 12, and 19, Maccanti in view of Mihalcea and Finlow-Bates teaches all the features with respect to claims 1, 8, and 15 above respectively but does not expressly disclose:
wherein the first merge files are in a temporary serializable format.
However, Florendo addresses this by teaching:
wherein the first merge files are in a temporary serializable format. (Florendo FIG. 2, ¶ 0042: portions of the table data are provided to a serializer 225, where the serializer 225 outputs a serialized version of the table data included within the table container 220)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the data buffers of Maccanti as modified with the data propagation and stream buffers of Florendo.
In addition, both of the references (Maccanti as modified and Florendo) disclose features that are directed to analogous art, and they are directed to the same field of endeavor, such as data buffering techniques.
Motivation to do so would be to improve the functioning of the data buffering of the packaged table data as in FIG. 9 of Maccanti as modified with the similar techniques in Florendo involving buffering but with the improved emphasis on serialized data being provided to a stream buffer of a fixed size. Motivation to do so would also be the teaching, suggestion, or motivation for one of ordinary skill in the art to ensure correctness post-move as seen in Florendo (¶ 0011).
Claims 6, 13, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Maccanti in view of Mihalcea and Finlow-Bates in further view of Florendo in further view of Higgins et al., U.S. Patent Application Publication No. 2020/0089710 (published March 19, 2020; provided in IDS 01/10/2025 and utilized in the parent application(s); hereinafter Higgins).
Regarding claims 6, 13, and 20, Maccanti in view of Mihalcea and Finlow-Bates and Florendo teaches all the features with respect to claims 5, 12, and 19 above respectively including the temporary serializable format. (Florendo FIG. 2, ¶ 0042-0043: portions of the table data are provided to a serializer 225, where the serializer 225 outputs a serialized version of the table data included within the table container 220; As the stream buffer 250 is filled, the corresponding data is sent to the deserializer 255, where the deserializer 255 performs operations to return the table data into the DAG format of table container 260)
Maccanti in view of Mihalcea and Finlow-Bates and Florendo further teaches:
and wherein a format of the result files is a non-arrow format. (Florendo FIG. 2, ¶ 0042-0043, notably ¶ 0043: As the stream buffer 250 is filled, the corresponding data is sent to the deserializer 255, where the deserializer 255 performs operations to return the table data into the DAG format of table container 260)
Maccanti in view of Mihalcea and Finlow-Bates and Florendo does not expressly disclose that the temporary serializable format is an arrow file format.
However, Higgins addresses this by teaching:
wherein the temporary serializable format is an arrow file format, (Higgins ¶ 0042: serialization of data between the system and the user device may take place via a particular scheme, such as Apache Arrow. In this way, the system may package the resulting data frame via the particular scheme and then transmit the resulting data frame to the user device; Higgins ¶ 0067: The resulting data frame 106 may comprise the final result packaged according to an example scheme (e.g., Apache Arrow); the information included in the resulting data frame 106 may be streamed to the user device 120)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the data export as in Maccanti as modified with the functioning of the data propagation and serialization of Higgins.
In addition, both of the references (Maccanti as modified and Higgins) disclose features that are directed to analogous art, and they are directed to the same field of endeavor, such as data propagation management.
Motivation to do so would be to improve the functioning of the packaged table data of Maccanti as modified with the similar techniques in Higgins involving packaging desired data but with the improvements of relieving a user device of use of its own bandwidth, memory, and/or processing power to access the resulting data frame as seen in Higgins (¶ 0068).
Claims 7, 14, and 21 are rejected under 35 U.S.C. 103 as being unpatentable over Maccanti in view of Mihalcea and Finlow-Bates in further view of Florendo in further view of Higgins in further view of Dola, U.S. Patent Application Publication No. 2015/0293980 (provided in IDS 01/10/2025 and utilized in the parent application(s); hereinafter Dola).
Regarding claims 7, 14, and 21, Maccanti in view of Mihalcea and Finlow-Bates and Florendo and Higgins teaches all the features with respect to claims 6, 13, and 20 above respectively but does not expressly disclose:
wherein the final file type is comma separate value (CSV) format.
However, Dola addresses this by teaching wherein the final file type is comma separate value (CSV) format. (Dola ¶ 0031: The naming convention may also indicate a subscriber network name, a vendor name, a data format (e.g., “csv” for a comma separated value format), and a file type (e.g., “gz” for a compressed file type). Filenames and the associated naming conventions may be utilized to facilitate the routing and processing of files)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the data propagation as in Maccanti as modified with the functioning of the data propagation of Dola.
In addition, both of the references (Maccanti as modified and Dola) disclose features that are directed to analogous art, and they are directed to the same field of endeavor, such as data propagation management.
Motivation to do so would be the teaching, suggestion, or motivation for one of ordinary skill in the art to process subscriber data into a target format that may be desirable for analytics as seen in Dola (¶ 0011).
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Wong et al., U.S. Patent Application Publication No. 2020/0117546; see Wong ¶ 0035-0038 describing subdividing an index into multiple partitions to build a perfect hash function for a subset of fingerprints, relevant to at least the independent claim limitations involving a local hash and a sub-divide hash.
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JEDIDIAH P FERRER whose telephone number is (571)270-7695. The examiner can normally be reached Monday-Friday 12:00pm-8:00pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kavita Stanley, can be reached at (571)272-8352. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/J.P.F/Examiner, Art Unit 2153 January 9, 2026
/KRIS E MACKES/Primary Examiner, Art Unit 2153