Last updated: April 19, 2026
Application No. 18/626,007
ADDRESSING MEMORY LIMITS FOR PARTITION TRACKING AMONG WORKER NODES

Non-Final OA §103 §DP
Filed: Apr 03, 2024
Examiner: GLASSER, DARA J
Art Unit: 2161
Tech Center: 2100 (Computer Architecture & Software)
Assignee: Splunk Inc.
OA Round: 1 (Non-Final)
This examiner grants 58% of cases after interview (+53.9% interview lift). A telephonic interview to clarify the technical implementation could significantly improve the outcome.
Based on 163 resolved cases, 2023–2026
Examiner Intelligence

Examiner: GLASSER, DARA J
Career Allow Rate: 58% of resolved cases (95 granted / 163 resolved), +3.3% vs TC avg
Interview Lift: +53.9% (resolved cases with interview compared to without)
Avg Prosecution: 3y 7m typical timeline; 9 currently pending
Total Applications: 172 across all art units (career history)
Statute-Specific Performance

§101: 11.6% (-28.4% vs TC avg)
§103: 47.6% (+7.6% vs TC avg)
§102: 9.5% (-30.5% vs TC avg)
§112: 26.7% (-13.3% vs TC avg)
Tech Center average is an estimate. Based on career data from 163 resolved cases.
Office Action

§103 §DP
DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Information Disclosure Statement
The information disclosure statement (IDS) submitted on June 14, 2024 is in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statement is being considered by the examiner.

Priority
Applicant’s claim for the benefit of a prior-filed application under 35 U.S.C. 119(e) or under 35 U.S.C. 120, 121, 365(c), or 386(c) is acknowledged. Applicant has not complied with one or more conditions for receiving the benefit of an earlier filing date under 35 U.S.C. 120 as follows:
The later-filed application must be an application for a patent for an invention which is also disclosed in the prior application (the parent or original nonprovisional application or provisional application). The disclosure of the invention in the parent application and in the later-filed application must be sufficient to comply with the requirements of 35 U.S.C. 112(a) or the first paragraph of pre-AIA 35 U.S.C. 112, except for the best mode requirement. See Transco Products, Inc. v. Performance Contracting, Inc., 38 F.3d 551, 32 USPQ2d 1077 (Fed. Cir. 1994).
The disclosure of the prior-filed application, Application No. 16/657,867, fails to provide adequate support or enablement in the manner provided by 35 U.S.C. 112(a) or pre-AIA 35 U.S.C. 112, first paragraph for claim 24 of this application.
The prior-filed application does not disclose “selecting the first partition for aggregation based on the first partition having a minimum number of records compared to other partitions of the set of data partitions,” as recited by claim 24, when “relocating at least a first record having a field value from . . . the first partition to . . . the second partition, wherein the second partition has a highest number of records sharing the field value, among the set of data partitions,” as recited by parent claim 1.
According to the as-filed specification of Application No. 16/657,867, “the node 3306 can shuffle records between partitions, such that records having the same key value are co-located within the same partition . . . shuffling can occur based on a highest count of a key value, such that all records for a key value are directed to a partition including a highest count record for the key value” (see [1320]). However, a partition is not selected, such that a record of the partition with a particular field value is relocated, based on the partition having a minimum number of records compared to the other partitions. Rather, paragraph [1320] discloses that any record containing a particular field value that is located on a partition that does not have the highest number of records sharing the particular field value is relocated to the partition containing the highest number of records sharing the particular field value.
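The shuffling behavior quoted from paragraph [1320] can be illustrated with a minimal Python sketch. It is only one possible reading of the quoted text, not the applicant's implementation: the function name `shuffle_by_highest_count`, the list-of-lists partition representation, and the tie-breaking rule (lowest partition index wins) are all assumptions made for illustration.

```python
from collections import Counter


def shuffle_by_highest_count(partitions, key):
    """Co-locate records sharing a key value in the partition that already
    holds the highest count of that key value (hypothetical helper)."""
    # Count how often each key value occurs in each partition.
    counts = [Counter(key(record) for record in part) for part in partitions]

    # For each key value, remember the partition index with the highest count.
    # Assumption: on a tie, the lowest partition index is kept.
    destination = {}
    for index, counter in enumerate(counts):
        for value, n in counter.items():
            if value not in destination or n > counts[destination[value]][value]:
                destination[value] = index

    # Direct every record to the destination partition for its key value.
    shuffled = [[] for _ in partitions]
    for part in partitions:
        for record in part:
            shuffled[destination[key(record)]].append(record)
    return shuffled
```

Note that this routine never looks at which partition has the fewest records overall; the destination is chosen per key value, which is the distinction the examiner draws against claim 24.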
The as-filed specification of Application No. 16/657,867 further recites that “the node 3306 collapses partitions by identifying a partition with a least number of records among the group of partitions, and moving individual records from the partition with the least number of records to a partition with a greatest number of records until that destination partition includes a maximum number of records” (see [1322]). However, each record that is moved from the partition with the least number of records is not relocated based on the record having a field value matching a field value of the destination partition. Nor is the destination selected based on having a highest number of records sharing the field value. Rather, according to this paragraph of the specification, the records are relocated based upon location within the partition having the least number of records, regardless of field value. Moreover, paragraph [1332] discloses that the destination partition is selected based on having a maximum number of records, regardless of field value.
Therefore, the specification fails to support “selecting the first partition for aggregation based on the first partition having a minimum number of records compared to other partitions of the set of data partitions,” as recited by claim 24, when “relocating at least a first record having a field value from . . . the first partition to . . . the second partition, wherein the second partition has a highest number of records sharing the field value, among the set of data partitions,” as recited by parent claim 1. Prior-filed Application No. 16/657,916 also fails to support “selecting the first partition for aggregation based on the first partition having a minimum number of records compared to other partitions of the set of data partitions,” as recited by claim 24, when “relocating at least a first record having a field value from . . . the first partition to . . . the second partition, wherein the second partition has a highest number of records sharing the field value, among the set of data partitions,” as recited by parent claim 1, for the same reasons as Application No. 16/657,867.

Claim Objections
Claim 7 is objected to because of the following informalities: Claim 7 recites “aggregating records of second particular partition,” which is missing an article prior to second. Examiner suggests revising the limitation to recite “aggregating records of the second particular partition” (emphasis added). Appropriate correction is required.
Claim 25 is objected to because of the following informalities: Claim 25 recites “selecting the second partition for aggregation based the second partition having a highest number of records,” which is grammatically incorrect. Examiner suggests revising the limitation to recite “selecting the second partition for aggregation based on the second partition having a highest number of records” (emphasis added). Appropriate correction is required.

Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP § 2146 et seq. for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
The filing of a terminal disclaimer by itself is not a complete reply to a nonstatutory double patenting (NSDP) rejection. A complete reply requires that the terminal disclaimer be accompanied by a reply requesting reconsideration of the prior Office action. Even where the NSDP rejection is provisional the reply must be complete. See MPEP § 804, subsection I.B.1. For a reply to a non-final Office action, see 37 CFR 1.111(a). For a reply to final Office action, see 37 CFR 1.113(c). A request for reconsideration while not provided for in 37 CFR 1.113(c) may be filed after final for consideration. See MPEP §§ 706.07(e) and 714.13.
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The actual filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/apply/applying-online/eterminal-disclaimer.
Claims 1-30 are rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1-28 of U.S. Patent No. 11,989,194 in view of Bellamkonda et al. (US Publication No. 2006/0116989). 

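Instant Application (Claim, Limitation) compared with U.S. Patent No. 11,989,194 (Claim, Limitation). In each row below, the instant application's claim number and limitation come first, followed by the patent's claim number and limitation.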
Instant Application, claim 1: A computer-implemented method comprising: obtaining, by at least one worker node a plurality of records associated with a query;
U.S. Patent No. 11,989,194, claim 1: A computer-implemented method comprising: obtaining, by at least one worker node of a distributed query execution environment, a chunk of data, wherein the chunk of data comprises a plurality of records associated with a query;

Instant Application, claim 1 (continued): assigning records of the plurality of records to individual data partitions of a set of data partitions at the at least one worker node, wherein individual partitions of the set of data partitions correspond to distinct portions of physical data storage of the at least one worker node; and
U.S. Patent No. 11,989,194, claim 1 (continued): assigning records of the plurality of records to individual data partitions of a set of data partitions at the at least one worker node, wherein individual partitions of the set of data partitions correspond to distinct portions of physical data storage of the at least one worker node;

U.S. Patent No. 11,989,194, claim 1 (continued, no corresponding limitation listed for the instant application): based on a number of data partitions exceeding a threshold value, combining records across partitions within the set of partitions, wherein combining records across partitions within the set of partitions combines records sharing a field value into a particular partition; combining the records sharing the field value in the particular partition into a single record having the field value; and

Instant Application, claim 1 (continued): reducing a number of data partitions in the set of data partitions by: aggregating records of a first partition with records of a second partition by relocating at least a first record having a field value from the distinct portion of physical data storage corresponding to the first partition to the distinct portion of physical data storage corresponding to the second partition, wherein the second partition has a highest number of records sharing the field value, among the set of data partitions, and
U.S. Patent No. 11,989,194, claim 1 (continued): reducing a number of partitions in the set of partitions by: selecting an additional partition from the set of data partitions to be aggregated with the particular partition, wherein the additional partition is selected from among the set of data partitions based on the additional partition having a highest number of records, among the set of data partitions, that does not exceed a maximum number of records allowable within the additional partition, aggregating records of the particular partition with records of the additional partition by relocating at least the single record having the field value from the distinct portion of physical data storage corresponding to the particular partition to the distinct portion of physical data storage corresponding to the additional partition, and

Instant Application, claim 1 (continued): removing the first partition from the at least one worker node.
U.S. Patent No. 11,989,194, claim 1 (continued): removing the particular partition from the at least one worker node.

 2
The computer-implemented method of Claim 1, wherein the set of data partitions is a first group of partitions, and wherein the at least one worker node maintains a plurality of groups of partitions, each group of partitions associated with a subset of potential field values.
 2
The computer-implemented method of claim 1, wherein the set of data partitions is a first group of partitions, and wherein the at least one worker node maintains a plurality of groups of partitions, each group of partitions associated with a subset of potential values of the field.
3
The computer-implemented method of Claim 1, wherein the set of data partitions is a first group of partitions, wherein the at least one worker node maintains a plurality of groups of partitions, and wherein a number of the groups is equal to a number of processor cores of the at least one worker node.

 3
The computer-implemented method of claim 1, wherein the set of data partitions is a first group of partitions, wherein the at least one worker node maintains a plurality of groups of partitions, and wherein a number of the groups is equal to a number of processor cores of the at least one worker node.

4
The computer-implemented method of Claim 1, wherein the set of data partitions is a first group of data partitions, and wherein the method further comprises: assigning one or more additional records of the plurality of records to individual data partitions of a second group of data partitions at the at least one worker node;
4
The computer-implemented method of claim 1, wherein the set of data partitions is a first group of partitions, and wherein the method further comprises: assigning one or more additional records of the plurality of records to individual data partitions of a second group of data partitions at the at least one worker node;

based on a number of data partitions satisfying a threshold value, combining records across partitions within the second group of data partitions, wherein combining records across partitions within the second group of data partitions combines records sharing a second field value in a particular partition of the second group of data partitions;

based on the number of data partitions satisfying a threshold value, combining records across partitions within the second group of partitions, wherein combining records across partitions within the second group of partitions combines records sharing a second field value in a particular partition of the second group;

combining the records sharing the field value in the particular partition of the second group of data partitions into an individual record having the second field value; 

combining the records sharing the field value in the particular partition of the second group into an individual record having the second field value; and

reducing the second group of data partitions by aggregating records of the particular partition of the second group with records of an additional partition of the second group and removing the particular partition of the second group from the at least one worker node; and

reducing the second group of data partitions by aggregating records of the particular partition of the second group with records of an additional partition of the second group and removing the particular partition of the second group from the at least one worker node; and

wherein operations related to the second group of data partitions occur concurrently with operations related to the first group of data partitions.

wherein operations related to the second group of data partitions occur concurrently with operations related to the first group of data partitions.
5
The computer-implemented method of Claim 1, wherein each data partition of the set of data partitions contains records received at the at least one worker node during a distinct time period.
5
The computer-implemented method of claim 1, wherein each data partition of the set of data partitions contains records received at the at least one worker node during a distinct time period.
6
The computer-implemented method of Claim 1, wherein assigning records of the plurality of records to individual data partitions of the set of data partitions at the at least one worker node comprises assigning records to an individual data partition of the set of data partitions until the individual data partition reaches a maximum number of records and then assigning records to a second individual data partition of the set of data partitions.
6
The computer-implemented method of claim 1, wherein assigning records of the plurality of records to individual data partitions of the set of data partitions at the at least one worker node comprises assigning records to an individual data partition of the set of data partitions until the individual data partition reaches a maximum number of records and then assigning records to a second individual data partition of the set of data partitions.
7
The computer-implemented method of Claim 1 further comprising: obtaining one or more chunks of data, the one or more chunks of data comprising a second plurality of records associated with the query;
7
The computer-implemented method of claim 1 further comprising: obtaining one or more additional chunks of data, the additional chunks comprising a second plurality of records associated with the query;

assigning records of the second plurality of records to individual data partitions of the set of data partitions at the at least one worker node;

assigning records of the second plurality of records to individual data partitions of the set of data partitions at the at least one worker node;



based on a number of data partitions satisfying a threshold value, combining records across partitions within the set of data partitions, wherein combining records across partitions within the set of data partitions combines records sharing a second field value in a second particular partition;

based on the number of data partitions satisfying the threshold value, combining records across partitions within the set of partitions, wherein combining records across partitions within the set of partitions combines records sharing a second field value in a second particular partition;

combining the records sharing the second field value in the second particular partition into an individual record having the second field value; and

combining the records sharing the second field value in the second particular partition into an individual record having the second field value; and

reducing the set of data partitions by aggregating records of second particular partition with records of another partition and removing the second particular partition from the at least one worker node.

reducing the set of data partitions by aggregating records of the second particular partition with records of another partition and removing the second particular partition from the at least one worker node.
8
The computer-implemented method of Claim 1, wherein each record of the plurality of records reflects one or more events detected within raw machine data.
8
The computer-implemented method of claim 1, wherein each record of the plurality of records reflects one or more events detected within raw machine data.
9
The computer-implemented method of Claim 1, wherein each record of the plurality of records reflects one or more events detected within raw machine data, and wherein a chunk of data is obtained from an indexer device configured to generate a record from the one or more events.
9
The computer-implemented method of Claim 1, wherein each record of the plurality of records reflects one or more events detected within raw machine data, and wherein the chunk is obtained from an indexer device configured to generate the record from the one or more events.
10
The computer-implemented method of Claim 1, wherein the first partition includes records obtained from multiple different chunks of data.
10
The computer-implemented method of claim 1, wherein the particular partition includes records obtained from multiple different chunks.
11
The computer-implemented method of Claim 7 further comprising, prior to combining records across partitions within the set of data partitions, combining records in each partition that have shared field values.
11
The computer-implemented method of claim 1 further comprising, prior to combining records across partitions within the set of partitions, combining records in each partition that have shared field values.
12
The computer-implemented method of Claim 7, wherein the number of data partitions is a number of data partitions at the at least one worker node.
12
The computer-implemented method of claim 1, wherein the number of data partitions is a number of data partitions at the at least one worker node.
13
The computer-implemented method of Claim 7, wherein the at least one worker node is associated with a distributed query execution environment, wherein the at least one worker node is one of a plurality of worker nodes within the distributed query execution environment, and wherein the number of data partitions is a number of data partitions across the plurality of worker nodes.
13
The computer-implemented method of claim 1, wherein the at least one worker node is one of a plurality of worker nodes within the distributed query execution environment, and wherein the number of data partitions is a number of data partitions across the plurality of worker nodes.
14
The computer-implemented method of Claim 7, wherein the at least one worker node is associated with a distributed query execution environment, wherein the distributed query execution environment includes a search master configured to track the number of data partitions, and wherein the method further comprises obtaining the number of data partitions from the search master.
14
The computer-implemented method of claim 1, wherein the distributed query execution environment includes a search master configured to track the number of data partitions, and wherein the method further comprises obtaining the number of data partitions from the search master.
15
The computer-implemented method of Claim 7, wherein the at least one worker node is associated with a distributed query execution environment, wherein the distributed query execution environment includes a search master configured to track the number of data partitions, and wherein the method further comprises reporting the number of data partitions to the search master.
15
The computer-implemented method of claim 1, wherein the distributed query execution environment includes a search master configured to track the number of data partitions, and wherein the method further comprises reporting the number of data partitions to the search master.
16
The computer-implemented method of Claim 7, wherein the at least one worker node is associated with a distributed query execution environment, wherein the distributed query execution environment includes a search master configured to track the number of data partitions, and wherein the method further comprises reporting the number of data partitions to the search master and obtaining the number of data partitions from the search master in response to the reporting.
16
The computer-implemented method of claim 1, wherein the distributed query execution environment includes a search master configured to track the number of data partitions, and wherein the method further comprises reporting the number of data partitions to the search master and obtaining the number of data partitions from the search master in response to the reporting.
17
The computer-implemented method of Claim 7, wherein the threshold value is set based on a memory allocated to track the number of data partitions.
17
The computer-implemented method of claim 1, wherein the threshold is set based on a memory allocated to track the number of data partitions.
18
The computer-implemented method of Claim 7, wherein the threshold value is set based on a memory allocated to track the number of data partitions, and wherein the memory allocated to track the number of data partitions is determined from a data type of a variable allocated to track the number of data partitions.
18
The computer-implemented method of claim 1, wherein the threshold is set based on a memory allocated to track the number of data partitions, and wherein the memory allocated to track the number of data partitions is determined from a data type of a variable allocated to track the number of data partitions.
19
The computer-implemented method of Claim 7, wherein the threshold value is set based on a memory allocated to track the number of data partitions, and wherein the threshold value is set to avoid an overflow error in the memory when the number of data partitions satisfies the threshold value.
19
The computer-implemented method of claim 1, wherein the threshold is set based on a memory allocated to track the number of data partitions, and wherein the threshold is set to avoid an overflow error in the memory when the number of data partitions satisfies the threshold value.
20
The computer-implemented method of Claim 1 further comprising: combining, within the second partition, two or more records sharing the field value into an individual record having the field value; and
20
The computer-implemented method of claim 1 further comprising: combining, within the second partition, two or more records sharing the field value into an individual record having the field value; and

reducing the set of data partitions by aggregating records of the second partition with records of a third partition and removing the second partition from the at least one worker node.

reducing the set of data partitions by aggregating records of the additional partition with records of another partition and removing the additional partition from the at least one worker node.
21
The computer-implemented method of Claim 1, wherein the query is associated with multiple chunks of data, and wherein the method is implemented prior to one or more chunks of data being obtained at the at least one worker node.
21
The computer-implemented method of claim 1, wherein the query is associated with multiple chunks, and wherein the method is implemented prior to one or more additional chunks being obtained at the at least one worker node.
22
The computer-implemented method of Claim 1, wherein the field value is derived from a combination of fields of the plurality of records.
22
The computer-implemented method of claim 1, wherein the field value is derived from a combination of fields of the plurality of records.
23
The computer-implemented method of Claim 1, wherein reducing the number of data partitions by aggregating records of the first partition with records of the second partition comprises selecting the first partition for aggregation based on a number of records within the first partition.
23
The computer-implemented method of claim 1, wherein reducing the set of data partitions by aggregating records of the particular partition with records of an additional partition comprises selecting the particular partition for aggregation based on a number of records within the particular partition.
24
The computer-implemented method of Claim 1, wherein reducing the number of data partitions by aggregating records of the first partition with records of the second partition comprises selecting the first partition for aggregation based on the first partition having a minimum number of records compared to other partitions of the set of data partitions.
24
The computer-implemented method of claim 1, wherein reducing the set of data partitions by aggregating records of the particular partition with records of an additional partition comprises selecting the particular partition for aggregation based on the particular partition having a minimum number of records compared to other partitions of the set of data partitions.
Instant Application, claim 25: The computer-implemented method of Claim 1, wherein reducing the number of data partitions by aggregating records of the first partition with records of the second partition comprises selecting the second partition for aggregation based the second partition having a highest number of records, compared to other partitions of the set of data partitions, that does not exceed a maximum number of records allowable within the second partition.
U.S. Patent No. 11,989,194, claim 1: A computer-implemented method comprising: obtaining, by at least one worker node of a distributed query execution environment, a chunk of data, wherein the chunk of data comprises a plurality of records associated with a query;
U.S. Patent No. 11,989,194, claim 1 (continued): assigning records of the plurality of records to individual data partitions of a set of data partitions at the at least one worker node, wherein individual partitions of the set of data partitions correspond to distinct portions of physical data storage of the at least one worker node;
U.S. Patent No. 11,989,194, claim 1 (continued): based on a number of data partitions exceeding a threshold value, combining records across partitions within the set of partitions, wherein combining records across partitions within the set of partitions combines records sharing a field value into a particular partition; combining the records sharing the field value in the particular partition into a single record having the field value; and
U.S. Patent No. 11,989,194, claim 1 (continued): reducing a number of partitions in the set of partitions by: selecting an additional partition from the set of data partitions to be aggregated with the particular partition, wherein the additional partition is selected from among the set of data partitions based on the additional partition having a highest number of records, among the set of data partitions, that does not exceed a maximum number of records allowable within the additional partition, aggregating records of the particular partition with records of the additional partition by relocating at least the single record having the field value from the distinct portion of physical data storage corresponding to the particular partition to the distinct portion of physical data storage corresponding to the additional partition, and
U.S. Patent No. 11,989,194, claim 1 (continued): removing the particular partition from the at least one worker node.



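Claims 17-19 in the comparison above tie the threshold value to the memory allocated to track the number of data partitions, as determined from the data type of the tracking variable, so that an overflow error is avoided. A minimal sketch of that relationship follows. It assumes an unsigned integer counter, and the names `partition_threshold`, `should_reduce`, and `headroom`, as well as the use of `>=` for "satisfies the threshold value", are illustrative assumptions rather than anything stated in the claims.

```python
import ctypes


def partition_threshold(counter_type=ctypes.c_uint16, headroom=1):
    """Derive a partition-count threshold from the size of the counter's
    data type. Assumption: the counter is an unsigned integer type."""
    bits = 8 * ctypes.sizeof(counter_type)
    max_value = (1 << bits) - 1  # largest value the unsigned counter can hold
    # Trigger reduction slightly before the counter would overflow.
    return max_value - headroom


def should_reduce(num_partitions, threshold):
    """Assumption: the threshold is 'satisfied' when the count reaches it."""
    return num_partitions >= threshold
```

With a 16-bit unsigned counter and a headroom of 1, partition reduction would start at 65,534 partitions, one short of the largest representable count.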
Claims 1-28 of U.S. Patent No. 11,989,194 do not specifically disclose wherein the second partition has a highest number of records sharing the field value, among the set of data partitions. However, Bellamkonda teaches wherein the second partition [persistently stored partition] has a highest number of records [entries] sharing the field value [customer ID value], among the set of data partitions [partitions storing aggregations] (see e.g., [0003] for data in a relational database management system (RDBMS) being aggregated in response to a query, such as a SQL query, that includes an aggregation function (e.g., SUM, COUNT, AVG, etc.) with a GROUP BY clause; [0035] for if the query that specifies the aggregation operation specifies grouping by customer ID (e.g., GROUPBY cust_id), then the value in the customer ID field being input to the hash function 104, which generates a hash value and the hash function being constructed so that the hash values are determinative of to which partition 108a-108d each data item will be partitioned; [0036] for each entry in a partition storing a running tally of the measure being aggregated and for example, if a query specifying sum of sales with group by customer ID is issued against table 102, then during processing of the query, an entry that includes a running summation of the sales values being stored in volatile memory in association with the corresponding partition 108a-108d, for each unique value of cust_id; [0058] for while processing the input data items, if a data item is read, for which there is not enough volatile memory available to store the corresponding entry in the partition, then a partition being selected to spill to persistent storage (e.g., disk) and in one embodiment, the largest partition, i.e., the partition having the most volatile memory slots, being selected for spillage to persistent storage; [0060] for once a partition is spilled to persistent storage, then data items for that partition that are processed later (i.e., after that partition is spilled) no longer being aggregated on-the-fly, but simply being hashed and stored in volatile memory slots corresponding to the partition, and selecting the "victim" partition, i.e., the partition that gets spilled to disk, being generally based on the following: a partition that has already been spilled to disk being selected so as not to increase the number of partitions to be processed in phase two, and to be able to keep aggregating the partitions that are in memory. Since a partition that has already been spilled to disk is selected to be spilled to disk again, the persistently stored partition has, among the partitions storing aggregations, the highest number of entries sharing the same customer ID value as the partition selected to be spilled to disk.).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to modify the method of claims 1-28 of U.S. Patent No. 11,989,194 wherein the second partition has a highest number of records sharing the field value, among the set of data partitions, as taught by Bellamkonda, for the benefit of freeing the maximum amount of full memory (see e.g., Bellamkonda, [0058]).

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 1-4, 7-16, and 21-30 are rejected under 35 U.S.C. 103 as being unpatentable over Gould et al. (US Publication No. 2005/0102325) in view of Skjolsvold et al. (US Publication No. 2013/0204991) and further in view of Bellamkonda et al. (US Publication No. 2006/0116989).

As to claim 1, Gould teaches a computer-implemented method comprising:
obtaining, by at least one worker node [profiling and processing subsystem], a plurality of records associated with a query (see e.g., [0072] for a data processing system 10 including a profiling and processing subsystem 20, which is used to process data from data sources 30, [0076] for the profiling and processing subsystem 20 including a profiling module 100, which reads data directly from a data source without necessarily landing a complete copy of the data to a storage medium before profiling in units of discrete work elements such as individual records, [0087] for the profiling module 100 obtaining an initial portion of the data set, [0079] for the profiling module 100 reading records from a data source, and [0203] for profiling module 100 generating a "virtual table" that includes fields from the multiple sources and the virtual table being generated, for example, by performing a join operation on the sources using a key field that is common to the sources. The profiling and processing subsystem obtains records associated with a join query.);
assigning records of the plurality of records to individual data partitions of a set of data partitions at the at least one worker node, wherein individual partitions of the set of data partitions correspond to distinct portions of physical data storage [parallel processors and/or computers] of the at least one worker node (see e.g., [0125] for the partition by round-robin component 612 taking records from the single or multiple partitions of the input data set 402 and re-partitioning the records among a number of parallel processors and/or computers (e.g., as selected by the user) in order to balance the work load among the processors and/or computers. The profiling and processing subsystem assigns the records to partitions. The partitions correspond to distinct processors and/or computers.); and
reducing a number of data partitions in the set of data partitions by:
aggregating records [census elements] of a first partition with records [census elements]  of a second partition by relocating at least a first record having a field value from the distinct portion of physical data storage corresponding to the first partition to the distinct portion of physical data storage corresponding to the second partition (see e.g., [0126] for the canonicalize component 616 taking in a flow of records and sending out a flow of census elements containing a field/value pair representing values for each field in an input record, an input record with ten fields yielding a flow of ten census elements and the census elements flowing into a local rollup field/value component which (for each partition) takes occurrences of the same value for the same field and combines them into one census element including a count of the number of occurrences, [0128] for the partition by field/value component 624 re-partitioning the census elements by field and value so that the rollup process performed in the global rollup field/value component 626 can add the occurrences calculated in different partitions to produce a total occurrences count in a single census element for each unique field/value pair contained within the profiled records and the global rollup field/value component 626 processing these census elements in potentially multiple partitions for a potentially parallel file represented by the census file component 410, and [0129] for a partition by field component 632 reading a flow of census elements from the census file component 410 and re-partitioning the census elements according to a hash value based on the field such that census records with the same field (but different values) are in the same partition. Census elements of a first partition are aggregated with census elements of a second partition by relocating at least a first census element having a field value from the first partition to the second partition. 
The number of partitions is then reduced when census records with the same field but different values are merged into a single partition.).
Gould does not specifically disclose reducing a number of data partitions in the set of data partitions by: removing the first partition from the at least one worker node. However, Skjolsvold teaches reducing a number of data partitions in the set of data partitions by:
removing the first partition from the at least one worker node [current server] (see e.g., [0083] for when partitions are merged, as an initial step the partitions for merger being unassigned from the current server, for example, a first partition on server S2 having a low key value of K and a high key value of M, in this example, the epoch number for the first partition being 7, a second partition on server S4 having a low key value of M and a high key value of N, the epoch value for the second partition being 9 in this example, as an initial step, the partitions being unassigned from their respective servers, so that the partition table shows a non-assigned value for the server, the two partition entries being then replaced with a single entry having a low key of K and a high key of N, the epoch number assigned to this partition being one greater than the highest value of the merged partitions, which corresponds to 10 in this example, and the new partition then being assigned to a server. The number of partitions is reduced by removing the first partition from the current server.).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to modify the data processing system of Gould to reduce a number of data partitions in the set of data partitions by: removing the first partition from the at least one worker node, as taught by Skjolsvold, for the benefit of reducing the overhead required to track and maintain the various partitions for a data set (see e.g., Skjolsvold, [0063]).
Gould in view of Skjolsvold does not specifically disclose wherein the second partition has a highest number of records sharing the field value, among the set of data partitions. However, Bellamkonda teaches
wherein the second partition [persistently stored partition] has a highest number of records [entries] sharing the field value [customer ID value], among the set of data partitions [partitions storing aggregations] (see e.g., [0003] for data in a relational database management system (RDBMS) being aggregated in response to a query, such as a SQL query, that includes an aggregation function (e.g., SUM, COUNT, AVG, etc.) with a GROUP BY clause, [0035] for if the query that specifies the aggregation operation specifies grouping by customer ID (e.g., GROUPBY cust_id), then the value in the customer ID field being input to the hash function 104, which generates a hash value and the hash function being constructed so that the hash values are determinative of to which partition 108a-108d each data item will be partitioned, [0036] for each entry in a partition storing a running tally of the measure being aggregated and for example, if a query specifying sum of sales with group by customer ID is issued against table 102, then during processing of the query, an entry that includes a running summation of the sales values being stored in volatile memory in association with the corresponding partition 108a-108d, for each unique value of cust_id, [0058] for while processing the input data items, if a data item is read, for which there is not enough volatile memory available to store the corresponding entry in the partition, then a partition being selected to spill to persistent storage (e.g., disk) and in one embodiment, the largest partition, i.e., the partition having the most volatile memory slots, being selected for spillage to persistent storage, [0060] for once a partition is spilled to persistent storage, then data items for that partition that are processed later (i.e., after that partition is spilled) no longer being aggregated on-the-fly, but simply being hashed and stored in volatile memory slots corresponding to the partition, and selecting the "victim" partition, i.e., the partition that gets spilled to disk, being generally based on the following, a partition that has already been spilled to disk being selected so as not to increase the number of partitions to be processed in phase two, and to be able to keep aggregating the partitions that are in memory. Since a partition that has already been spilled to disk is selected to be spilled to disk again, the persistently stored partition has, among the partitions storing aggregations, the highest number of entries sharing the same customer ID value as the partition selected to be spilled to disk.).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to modify the data processing system of Gould in view of Skjolsvold wherein the second partition has a highest number of records sharing the field value, among the set of data partitions, as taught by Bellamkonda, for the benefit of freeing the maximum amount of full memory (see e.g., Bellamkonda, [0058]).
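For illustration only, and not as part of any cited disclosure, the hash partitioning and largest-partition spill selection cited from Bellamkonda at [0035], [0036], [0058], and [0060] can be sketched as follows. The names NUM_PARTITIONS, partition_of, aggregate, and max_entries are hypothetical, the entry cap stands in for the volatile memory limit, and only the "largest partition" embodiment of [0058] is modeled; the preference of [0060] for an already-spilled victim is not.

```python
from collections import defaultdict

NUM_PARTITIONS = 4  # hypothetical; mirrors partitions 108a-108d


def partition_of(cust_id: int) -> int:
    """Map a grouping key to a partition (stand-in for hash function 104)."""
    return hash(cust_id) % NUM_PARTITIONS


def aggregate(rows, max_entries):
    """Sum sales per cust_id, spilling the largest partition when memory is full.

    rows: iterable of (cust_id, sales) pairs.
    max_entries: hypothetical cap on in-memory entries across all partitions.
    Returns (in_memory_partitions, spilled_partitions).
    """
    in_memory = defaultdict(dict)  # partition -> {cust_id: running sum}
    spilled = defaultdict(list)    # partition -> entries written to "disk"
    for cust_id, sales in rows:
        p = partition_of(cust_id)
        if p in spilled:
            # A spilled partition is no longer aggregated on the fly.
            spilled[p].append((cust_id, sales))
            continue
        entries = in_memory[p]
        total = sum(len(e) for e in in_memory.values())
        if cust_id not in entries and total >= max_entries:
            # Victim: the partition holding the most in-memory entries.
            victim = max(in_memory, key=lambda q: len(in_memory[q]))
            spilled[victim].extend(in_memory.pop(victim).items())
            if victim == p:
                spilled[p].append((cust_id, sales))
                continue
        entries[cust_id] = entries.get(cust_id, 0) + sales
    return dict(in_memory), dict(spilled)


rows = [(0, 10), (4, 5), (0, 1), (1, 7), (8, 2)]
in_mem, on_disk = aggregate(rows, max_entries=3)
```

In this sketch the partition holding customer IDs 0 and 4 is the largest when the fifth row arrives, so it is spilled, and the later row for that partition is stored without further aggregation.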

As to claim 2, the limitations of parent claim 1 have been discussed above. Gould does not specifically disclose wherein the set of data partitions is a first group of partitions, and wherein the at least one worker node maintains a plurality of groups of partitions, each group of partitions associated with a subset of potential field values. However, Skjolsvold teaches
wherein the set of data partitions is a first group of partitions, and wherein the at least one worker node [partition master] maintains a plurality of groups of partitions [partitions assigned to partition servers], each group of partitions associated with a subset of potential field values [keys]  (see e.g., [0017] for a key being a value from a namespace or domain, an example of a namespace being an identifier corresponding to all storage accounts in a cloud computing environment, and in such an example, a key corresponding to an account name, account number, or another identifier that allows a specific account to be referenced, [0018] for a "partition" being a range defined by a low (inclusive) and high (exclusive) key, [0019] for a "partition server" being a virtual machine within a cloud computing environment that corresponds to a role instance for serving zero or more partitions, [0020] for a "partition master" being a role that manages partition servers for a given type of role, such as by assigning and unassigning partitions to partition servers, and [0029] for based on a namespace, a computation being partitioned so that multiple partition servers handle or perform different portions of processing for the namespace, each partition corresponding to a range of key values, and when a partition is assigned to a partition server, the server performing the desired computation for any requests that contain a key value within the range corresponding to an assigned partition. The partition master maintains a plurality of groups of partitions assigned to partition servers. Each group of partitions assigned to a partition server is associated with a subset of keys of an identifier.).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to modify the data processing system of Gould wherein the set of data partitions is a first group of partitions, and wherein the at least one worker node maintains a plurality of groups of partitions, each group of partitions associated with a subset of potential field values, as taught by Skjolsvold, for the benefit of providing a framework for handling features such as scalability, fault tolerance, and/or availability while reducing or minimizing the amount of effort required to address these features (see e.g., Skjolsvold, [0016]).
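For illustration only, the key-range partitions of Skjolsvold at [0018] and [0029] (low key inclusive, high key exclusive, each partition assigned to a partition server) can be sketched as a lookup table. The table contents, the server names other than S2 and S4, and the functions server_for and groups_by_server are hypothetical and are not taken from the reference.

```python
from bisect import bisect_right
from collections import defaultdict

# Hypothetical partition table: (low key inclusive, high key exclusive, server).
PARTITION_TABLE = [
    ("A", "K", "S1"),
    ("K", "M", "S2"),
    ("M", "N", "S4"),
    ("N", "Z", "S1"),
]


def server_for(key, table=PARTITION_TABLE):
    """Return the server whose assigned partition covers key, or None."""
    lows = [low for low, _, _ in table]
    i = bisect_right(lows, key) - 1  # last partition whose low key <= key
    if i < 0:
        return None
    low, high, server = table[i]
    return server if low <= key < high else None


def groups_by_server(table=PARTITION_TABLE):
    """Group partitions by server: one group of key ranges per server."""
    groups = defaultdict(list)
    for low, high, server in table:
        groups[server].append((low, high))
    return dict(groups)
```

Under these assumptions each server's group covers a distinct subset of the key space, which is the sense in which the rejection maps "groups of partitions" to partitions assigned to partition servers.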

As to claim 3, the limitations of parent claim 1 have been discussed above. Gould does not specifically disclose wherein the set of data partitions is a first group of partitions, wherein the at least one worker node maintains a plurality of groups of partitions, and wherein a number of the groups is equal to a number of processor cores of the at least one worker node. However, Skjolsvold teaches
wherein the set of data partitions is a first group of partitions, wherein the at least one worker node [partition master] maintains a plurality of groups of partitions [partitions assigned to partition servers], and wherein a number of the groups is equal to a number of processor cores [partition servers] of the at least one worker node (see e.g., [0017] for a key being a value from a namespace or domain, an example of a namespace being an identifier corresponding to all storage accounts in a cloud computing environment, and in such an example, a key corresponding to an account name, account number, or another identifier that allows a specific account to be referenced, [0018] for a "partition" being a range defined by a low (inclusive) and high (exclusive) key, [0019] for a "partition server" being a virtual machine within a cloud computing environment that corresponds to a role instance for serving zero or more partitions, [0020] for a "partition master" being a role that manages partition servers for a given type of role, such as by assigning and unassigning partitions to partition servers, and [0029] for based on a namespace, a computation being partitioned so that multiple partition servers handle or perform different portions of processing for the namespace, each partition corresponding to a range of key values, and when a partition is assigned to a partition server, the server performing the desired computation for any requests that contain a key value within the range corresponding to an assigned partition. The partition master maintains a plurality of groups of partitions assigned to partition servers. The number of groups of partitions assigned to partition servers is equal to the number of partition servers.).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to modify the data processing system of Gould wherein the set of data partitions is a first group of partitions, wherein the at least one worker node maintains a plurality of groups of partitions, and wherein a number of the groups is equal to a number of processor cores of the at least one worker node, as taught by Skjolsvold, for the benefit of providing a framework for handling features such as scalability, fault tolerance, and/or availability while reducing or minimizing the amount of effort required to address these features (see e.g., Skjolsvold, [0016]).

As to claim 4, the limitations of parent claim 1 have been discussed above. Gould teaches wherein the set of data partitions is a first group of data partitions, and wherein the method further comprises:
assigning one or more additional records of the plurality of records to individual data partitions of a second group of data partitions at the at least one worker node (see e.g., [0094] for a flow control mechanism being implemented using input queues for the links entering a component, [0095] for when two components are connected by a flow, the upstream component sending work elements to the downstream component as long as the downstream component keeps consuming the work elements and if the downstream component falls behind, the upstream component filling up the input queue of the downstream component and stopping work until the input queue clears out again, and [0125] for the partition by round-robin component 612 taking records from the single or multiple partitions of the input data set 402 and re-partitioning the records among a number of parallel processors and/or computers (e.g., as selected by the user) in order to balance the work load among the processors and/or computers. The profiling and processing subsystem assigns additional input records to a second group of partitions.);
combining records [census elements] across partitions within the second group of data partitions, wherein combining records across partitions within the second group of data partitions combines records sharing a second field value in a particular partition of the second group of data partitions (see e.g., [0126] for the canonicalize component 616 taking in a flow of records and sending out a flow of census elements containing a field/value pair representing values for each field in an input record, an input record with ten fields yielding a flow of ten census elements and the census elements flowing into a local rollup field/value component which (for each partition) takes occurrences of the same value for the same field and combines them into one census element including a count of the number of occurrences and [0128] for the partition by field/value component 624 re-partitioning the census elements by field and value. Census elements across the second group of partitions that share a field value are combined in a particular partition.); 
combining the records sharing the field value in the particular partition of the second group of data partitions into an individual record [census element] having the second field value (see e.g., [0128] for the partition by field/value component 624 re-partitioning the census elements by field and value so that the rollup process performed in the global rollup field/value component 626 can add the occurrences calculated in different partitions to produce a total occurrences count in a single census element for each unique field/value pair contained within the profiled records. Census elements sharing the field value in the particular partition of the second group are combined into a single census element having the field value.); and
reducing the second group of data partitions by aggregating records of the particular partition of the second group of data partitions with records of an additional partition of the second group of data partitions (see e.g., [0128] for the global rollup field/value component 626 processing these census elements in potentially multiple partitions for a potentially parallel file represented by the census file component 410 and [0129] for a partition by field component 632 reading a flow of census elements from the census file component 410 and re-partitioning the census elements according to a hash value based on the field such that census records with the same field (but different values) are in the same partition. Census elements of the particular partition are aggregated with census elements of an additional partition having the same field but different value. This reduces the number of partitions since there are fewer fields than values per field.).
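For illustration only, the round-robin partitioning, canonicalization, local rollup, and global rollup cited from Gould at [0125], [0126], and [0128] can be sketched in a single process as follows. The function names and sample records are hypothetical, and the intermediate re-partitioning by field/value hash is omitted for brevity.

```python
from collections import Counter


def round_robin(records, n):
    """Deal records across n partitions in turn (cf. component 612)."""
    return [records[i::n] for i in range(n)]


def canonicalize(record):
    """Turn one record into one (field, value) census element per field."""
    return list(record.items())


def local_rollup(partition):
    """Count occurrences of each (field, value) pair within one partition."""
    counts = Counter()
    for record in partition:
        counts.update(canonicalize(record))
    return counts


def global_rollup(local_counts):
    """Add per-partition counts into one total per unique (field, value)."""
    total = Counter()
    for counts in local_counts:
        total.update(counts)  # Counter.update adds counts from a mapping
    return total


records = [
    {"state": "CA", "zip": "94000"},
    {"state": "CA", "zip": "94001"},
    {"state": "CA", "zip": "94000"},
]
partitions = round_robin(records, 2)
census = global_rollup(local_rollup(p) for p in partitions)
```

Here the three occurrences of the pair (state, CA), counted separately in two partitions, end up in a single census element with a count of three.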
Gould does not specifically disclose based on a number of data partitions satisfying a threshold value, combining records across partitions within the second group of data partitions; reducing the second group of data partitions by removing the particular partition of the second group of data partitions from the at least one worker node; and wherein operations related to the second group of data partitions occur concurrently with operations related to the first group of data partitions. However, Skjolsvold teaches
based on a number of data partitions satisfying a threshold value [value approaching the upper limit], combining records across partitions within the second group of data partitions (see e.g., [0063] for merging of partitions allowing partitions that have lower amounts of activity to be combined, this reducing the overhead required to track and maintain the various partitions for a data set, optionally, a user defining an upper limit on the number of partitions for a namespace, the thresholds for initiating a merge of partitions being reduced as the number of partitions approaches the upper limit, and the upper limit for number of partitions being dynamically configured and [0080] for it being often desirable to avoid having too many partitions, as the maximum number of partitions is approached for a server, the likelihood of merging partitions increasing, and as an example, it being desirable to maintain between 5 and 8 partitions per server. Based on the number of data partitions exceeding a value approaching the upper limit, partitions may be combined.);
reducing the second group of data partitions by removing the particular partition [first partition] of the second group of data partitions from the at least one worker node [current server] (see e.g., [0083] for when partitions are merged, as an initial step the partitions for merger being unassigned from the current server, for example, a first partition on server S2 having a low key value of K and a high key value of M, in this example, the epoch number for the first partition being 7, a second partition on server S4 having a low key value of M and a high key value of N, the epoch value for the second partition being 9 in this example, as an initial step, the partitions being unassigned from their respective servers, so that the partition table shows a non-assigned value for the server, the two partition entries being then replaced with a single entry having a low key of K and a high key of N, the epoch number assigned to this partition being one greater than the highest value of the merged partitions, which corresponds to 10 in this example, and the new partition then being assigned to a server. The number of partitions is reduced by removing the first partition from the current server.); and
wherein operations related to the second group of data partitions [partitions assigned to the second partition server] occur concurrently with operations related to the first group of data partitions [partitions assigned to the first partition server] (see e.g., [0017] for a key being a value from a namespace or domain, an example of a namespace being an identifier corresponding to all storage accounts in a cloud computing environment, and in such an example, a key corresponding to an account name, account number, or another identifier that allows a specific account to be referenced, [0018] for a "partition" being a range defined by a low (inclusive) and high (exclusive) key, [0019] for a "partition server" being a virtual machine within a cloud computing environment that corresponds to a role instance for serving zero or more partitions, [0020] for a "partition master" being a role that manages partition servers for a given type of role, such as by assigning and unassigning partitions to partition servers, [0029] for based on a namespace, a computation being partitioned so that multiple partition servers handle or perform different portions of processing for the namespace, each partition corresponding to a range of key values, and when a partition is assigned to a partition server, the server performing the desired computation for any requests that contain a key value within the range corresponding to an assigned partition, and [0061] for a heartbeat or another type of message being used to query each server regarding the server's current partition assignments, this query including a query for the name of the storage object for a server, and the tasks proceeding in parallel. Operations related to the partitions assigned to the first partition server occur concurrently with operations related to the partitions assigned to the second partition server.).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to modify the data processing system of Gould to, based on a number of data partitions satisfying a threshold value, combine records across partitions within the second group of data partitions; and to reduce the second group of data partitions by removing the particular partition of the second group of data partitions from the at least one worker node; wherein operations related to the second group of data partitions occur concurrently with operations related to the first group of data partitions, as taught by Skjolsvold, for the benefit of reducing the overhead required to track and maintain the various partitions for a data set (see e.g., Skjolsvold, [0063]).
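For illustration only, the merge example cited from Skjolsvold at [0083] (partitions K-M on server S2 with epoch 7 and M-N on server S4 with epoch 9 replaced by a single K-N partition with epoch 10) can be sketched as follows. The Partition class and the merge function are hypothetical, and the server chosen for the merged partition is an assumption, since the cited passage states only that the new partition is assigned to a server.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Partition:
    low: str                # low key, inclusive
    high: str               # high key, exclusive
    epoch: int
    server: Optional[str]   # None means unassigned


def merge(table: List[Partition], first: Partition, second: Partition,
          new_server: str) -> List[Partition]:
    """Replace two adjacent partitions with one covering both key ranges."""
    if first.high != second.low:
        raise ValueError("partitions must be adjacent")
    # Initial step: unassign both partitions from their current servers.
    first.server = None
    second.server = None
    # One entry replaces two; the epoch is one greater than the highest epoch.
    merged = Partition(first.low, second.high,
                       max(first.epoch, second.epoch) + 1, None)
    new_table = [p for p in table if p is not first and p is not second]
    new_table.append(merged)
    # Final step: assign the new partition to a server.
    merged.server = new_server
    return new_table


p1 = Partition("K", "M", 7, "S2")
p2 = Partition("M", "N", 9, "S4")
table = merge([p1, p2], p1, p2, new_server="S2")
```

After the call the table holds one entry instead of two, which corresponds to the reduction in the number of partitions relied upon in the rejection.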

As to claim 7, the limitations of parent claim 1 have been discussed above. Gould teaches
obtaining one or more chunks of data, the one or more chunks comprising a second plurality of records associated with the query (see e.g., [0094] for a flow control mechanism being implemented using input queues for the links entering a component, [0095] for when two components are connected by a flow, the upstream component sending work elements to the downstream component as long as the downstream component keeps consuming the work elements and if the downstream component falls behind, the upstream component filling up the input queue of the downstream component and stopping work until the input queue clears out again, and [0203] for profiling module 100 generating a "virtual table" that includes fields from the multiple sources and the virtual table being generated, for example, by performing a join operation on the sources using a key field that is common to the sources. Records that are part of the join operation continue being input into queues of the profiling and processing subsystem 20 after the method is implemented for the initial portion of the data set.);
assigning records of the second plurality of records to individual data partitions of the set of data partitions at the at least one worker node (see e.g., [0125] for the partition by round-robin component 612 taking records from the single or multiple partitions of the input data set 402 and re-partitioning the records among a number of parallel processors and/or computers (e.g., as selected by the user) in order to balance the work load among the processors and/or computers. The profiling and processing subsystem assigns the records to partitions.);
combining records [census elements] across partitions within the set of data partitions, wherein combining records across partitions within the set of data partitions combines records sharing a second field value in a second particular partition (see e.g., [0126] for the canonicalize component 616 taking in a flow of records and sending out a flow of census elements containing a field/value pair representing values for each field in an input record, an input record with ten fields yielding a flow of ten census elements and the census elements flowing into a local rollup field/value component which (for each partition) takes occurrences of the same value for the same field and combines them into one census element including a count of the number of occurrences and [0128] for the partition by field/value component 624 re-partitioning the census elements by field and value. Census elements across the partitions that share a field value are combined in a particular partition.);
combining the records sharing the second field value in the second particular partition into an individual record [census element] having the second field value (see e.g., [0128] for the partition by field/value component 624 re-partitioning the census elements by field and value so that the rollup process performed in the global rollup field/value component 626 can add the occurrences calculated in different partitions to produce a total occurrences count in a single census element for each unique field/value pair contained within the profiled records. Census elements sharing the field value in the particular partition are combined into a single census element having the field value.); and
reducing the set of data partitions by aggregating records of the second particular partition with records of another partition (see e.g., [0128] for the global rollup field/value component 626 processing these census elements in potentially multiple partitions for a potentially parallel file represented by the census file component 410 and [0129] for a partition by field component 632 reading a flow of census elements from the census file component 410 and re-partitioning the census elements according to a hash value based on the field such that census records with the same field (but different values) are in the same partition. Census elements of the particular partition are aggregated with census elements of an additional partition having the same field but different value. This reduces the number of partitions since there are fewer fields than values per field.).
Gould does not specifically disclose based on a number of data partitions satisfying a threshold value, combining records across partitions within the set of data partitions; and reducing the set of data partitions by removing the second particular partition from the at least one worker node. However, Skjolsvold teaches
based on a number of data partitions satisfying a threshold value, combining records across partitions within the set of data partitions (see e.g., [0063] for merging of partitions allowing partitions that have lower amounts of activity to be combined, this reducing the overhead required to track and maintain the various partitions for a data set, optionally, a user defining an upper limit on the number of partitions for a namespace, the thresholds for initiating a merge of partitions being reduced as the number of partitions approaches the upper limit, and the upper limit for number of partitions being dynamically configured and [0080] for it being often desirable to avoid having too many partitions, as the maximum number of partitions is approached for a server, the likelihood of merging partitions increasing, and as an example, it being desirable to maintain between 5 and 8 partitions per server. Based on the number of data partitions exceeding a value approaching the upper limit, partitions may be combined.); and
reducing the set of data partitions by removing the second particular partition from the at least one worker node (see e.g., [0083] for when partitions are merged, as an initial step the partitions for merger being unassigned from the current server, for example, a first partition on server S2 having a low key value of K and a high key value of M, in this example, the epoch number for the first partition being 7, a second partition on server S4 having a low key value of M and a high key value of N, the epoch value for the second partition being 9 in this example, as an initial step, the partitions being unassigned from their respective servers, so that the partition table shows a non-assigned value for the server, the two partition entries being then replaced with a single entry having a low key of K and a high key of N, the epoch number assigned to this partition being one greater than the highest value of the merged partitions, which corresponds to 10 in this example, and the new partition then being assigned to a server. The number of partitions is reduced by removing the first partition from the current server.).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to modify the data processing system of Gould to, based on a number of data partitions satisfying a threshold value, combine records across partitions within the set of data partitions; and reduce the set of data partitions by removing the second particular partition from the at least one worker node, as taught by Skjolsvold, for the benefit of reducing the overhead required to track and maintain the various partitions for a data set (see e.g., Skjolsvold, [0063]).
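For illustration only, the merge sequence that the action attributes to Skjolsvold [0083] can be sketched as follows. This is a minimal sketch under stated assumptions, not the reference's implementation: the names `Partition`, `merge_partitions`, and `new_server` are hypothetical, and the sketch assumes the two partitions cover adjacent key ranges.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Partition:
    # Hypothetical partition-table entry: key range, epoch number, assigned server.
    low_key: str
    high_key: str
    epoch: int
    server: Optional[str]  # None models the "non-assigned" value


def merge_partitions(
    table: List[Partition], first: Partition, second: Partition, new_server: str
) -> Tuple[List[Partition], Partition]:
    """Replace two adjacent entries with a single merged entry."""
    if first.high_key != second.low_key:
        raise ValueError("partitions must cover adjacent key ranges")
    # Initial step: unassign both partitions from their current servers.
    first.server = None
    second.server = None
    # The merged epoch is one greater than the highest epoch of the merged partitions.
    merged = Partition(
        first.low_key, second.high_key, max(first.epoch, second.epoch) + 1, None
    )
    # Replace the two entries with the single merged entry.
    new_table = [p for p in table if p is not first and p is not second]
    new_table.append(merged)
    # Finally, assign the new partition to a server.
    merged.server = new_server
    return new_table, merged
```

Using the values cited from [0083] (a partition on S2 spanning K to M with epoch 7 and a partition on S4 spanning M to N with epoch 9), the sketch yields one entry spanning K to N with epoch 10.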

As to claim 8, the limitations of parent claim 1 have been discussed above. Gould teaches
wherein each record of the plurality of records reflects one or more events [occurrence of a field attaining a particular value] detected within raw machine data (see e.g., [0113] for reading individual work elements (e.g., individual records) from raw data in a data system, the runtime environment providing access to a physical computer-readable storage medium (e.g., a magnetic, optical, or magneto-optical medium) as a string of raw data bits (e.g., mounted in a file system or flowing over a network connection), and the import component accessing a DML file to determine how to read and interpret the raw data in order to generate a flow of work elements and [0116] for a component interpreting a block of raw data to extract values for all of the fields of a record. Each record reflects occurrences of fields attaining particular values detected within raw machine data.).

As to claim 9, the limitations of parent claim 1 have been discussed above. Gould teaches
wherein each record of the plurality of records reflects one or more events [occurrence of a field attaining a particular value] detected within raw machine data, and wherein a chunk of data is obtained from an indexer device [import component] configured to generate a record from the one or more events (see e.g., [0106] for an import component implementing the portion of the profiling module 100 that can interpret the data format of a wide variety of data systems, [0113] for reading individual work elements (e.g., individual records) from raw data in a data system, the runtime environment providing access to a physical computer-readable storage medium (e.g., a magnetic, optical, or magneto-optical medium) as a string of raw data bits (e.g., mounted in a file system or flowing over a network connection), and the import component accessing a DML file to determine how to read and interpret the raw data in order to generate a flow of work elements and [0116] for a component interpreting a block of raw data to extract values for all of the fields of a record. Each record reflects occurrences of fields attaining particular values detected within raw machine data. The initial portion of the data set is obtained from the import component, which is configured to generate the record from occurrences of fields attaining particular values.).

As to claim 10, the limitations of parent claim 1 have been discussed above. Gould teaches 
wherein the first partition includes records obtained from multiple different chunks of data [data sources] (see e.g., [0072] for a data processing system 10 including a profiling and processing subsystem 20, which is used to process data from data sources 30, [0076] for the profiling and processing subsystem 20 including a profiling module 100, which reads data directly from a data source without necessarily landing a complete copy of the data to a storage medium before profiling in units of discrete work elements such as individual records, [0073] for data sources 30 in general including a variety of individual data sources, each of which may have unique storage formats and interfaces (for example, database tables, spreadsheet files, flat text files, or a native format used by a mainframe 110), and [0079] for the profiling module 100 reading records from a data source. The first partition includes census elements obtained from multiple different data sources.).

As to claim 11, the limitations of parent claim 7 have been discussed above. Gould teaches 
prior to combining records across partitions within the set of data partitions, combining records in each partition that have shared field values (see e.g., [0126] for the canonicalize component 616 taking in a flow of records and sending out a flow of census elements containing a field/value pair representing values for each field in an input record, an input record with ten fields yielding a flow of ten census elements and the census elements flowing into a local rollup field/value component which (for each partition) takes occurrences of the same value for the same field and combines them into one census element including a count of the number of occurrences. Records in each partition that have shared field values are combined into one census element.). 
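For illustration only, the two-stage rollup that the action attributes to Gould [0126] and [0128] (a per-partition rollup followed by re-partitioning by field/value and a global rollup) can be sketched as follows. The function names are hypothetical and the hash-based placement is an assumption of this sketch, not a detail taken from the reference.

```python
from collections import Counter
from typing import Dict, List, Tuple

CensusKey = Tuple[str, str]  # (field, value) pair


def canonicalize(record: Dict[str, str]) -> List[CensusKey]:
    """Emit one (field, value) census element per field of a record."""
    return list(record.items())


def local_rollup(partition_records: List[Dict[str, str]]) -> Counter:
    """Within one partition, combine identical (field, value) pairs into a count."""
    counts: Counter = Counter()
    for record in partition_records:
        counts.update(canonicalize(record))
    return counts


def global_rollup(local_counts: List[Counter], num_partitions: int) -> List[Counter]:
    """Re-partition census elements by (field, value) and add the local counts."""
    out = [Counter() for _ in range(num_partitions)]
    for counts in local_counts:
        for key, n in counts.items():
            # Assumption: placement by hash, so equal pairs land in the same partition.
            out[hash(key) % num_partitions][key] += n
    return out
```

In this sketch each unique field/value pair ends up in exactly one output partition, carrying the total of the occurrence counts computed in the separate input partitions.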

As to claim 12, the limitations of parent claim 7 have been discussed above. Gould does not specifically disclose wherein the number of data partitions is a number of data partitions at the at least one worker node. However, Skjolsvold teaches
wherein the number of data partitions is a number of data partitions at the at least one worker node (see e.g., [0018] for the union of all partitions spanning the entire domain or namespace, [0020] for a "partition master" being a role that manages partition servers for a given type of role, such as by assigning and unassigning partitions to partition servers, and [0063] for a user defining an upper limit on the number of partitions for a namespace, the thresholds for initiating a merge of partitions being reduced as the number of partitions approaches the upper limit, and the upper limit for number of partitions being dynamically configured. The number of partitions refers to the number of partitions managed by the partition master.).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to modify the data processing system of Gould wherein the number of data partitions is a number of data partitions at the at least one worker node, as taught by Skjolsvold, for the benefit of reducing the overhead required to track and maintain the various partitions for a data set (see e.g., Skjolsvold, [0063]).

As to claim 13, the limitations of parent claim 7 have been discussed above. Gould teaches
wherein the at least one worker node is associated with a distributed query execution environment (see e.g., [0093] for the runtime environment providing for the profiling module 100 to execute as a parallel process and parallel processing systems including any configuration of computer systems using multiple central processing units (CPUs), either locally distributed or remotely distributed).
Gould does not specifically disclose wherein the at least one worker node is one of a plurality of worker nodes within the distributed query execution environment, and wherein the number of data partitions is a number of data partitions across the plurality of worker nodes. However, Skjolsvold teaches
wherein the at least one worker node [dictator] is one of a plurality of worker nodes within the distributed query execution environment (see e.g., [0024] for partition masters for a given type of role being preferably redundant, so that at least one additional partition master is available if a failure occurs and a "dictator" being defined as the partition master that currently performs the partition master functions for a given type of role. The dictator is one of a plurality of partition masters.), and
wherein the number of data partitions is a number of data partitions across the plurality of worker nodes (see e.g., [0018] for the union of all partitions spanning the entire domain or namespace, [0020] for a "partition master" being a role that manages partition servers for a given type of role, such as by assigning and unassigning partitions to partition servers, and [0063] for a user defining an upper limit on the number of partitions for a namespace, the thresholds for initiating a merge of partitions being reduced as the number of partitions approaches the upper limit, and the upper limit for number of partitions being dynamically configured. The number of partitions refers to the number of partitions across the dictator and additional partition masters provided for redundancy.).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to modify the data processing system of Gould wherein the at least one worker node is one of a plurality of worker nodes within the distributed query execution environment, and wherein the number of data partitions is a number of data partitions across the plurality of worker nodes, as taught by Skjolsvold, for the benefit of reducing the overhead required to track and maintain the various partitions for a data set (see e.g., Skjolsvold, [0063]).

As to claim 14, the limitations of parent claim 7 have been discussed above. Gould teaches
wherein the at least one worker node is associated with a distributed query execution environment (see e.g., [0093] for the runtime environment providing for the profiling module 100 to execute as a parallel process and parallel processing systems including any configuration of computer systems using multiple central processing units (CPUs), either locally distributed or remotely distributed).
Gould does not specifically disclose wherein the distributed query execution environment includes a search master configured to track the number of data partitions, and wherein the method further comprises obtaining the number of data partitions from the search master. However, Skjolsvold teaches
wherein the distributed query execution environment includes a search master [partition table] configured to track the number of data partitions, and wherein the method further comprises obtaining the number of data partitions from the search master (see e.g., [0030] for a partition table being used to track the current assignments of partitions to partition servers, when an active master or dictator assigns a partition to a server, the partition table being updated first to reflect the assignment, the partition table then being used to determine the partition server that will handle a client request based on the key specified in the client request, as an example, an entry in a partition table including the low key for a range, the high key for the range, and the role instance or server instance that will perform a requested task on the data or state corresponding to requested key, and a partition table also including other data, such as an epoch number or version number, [0039] for each partition having a current epoch number that is updated by the master when a change occurs for the partition, examples of changes for a partition including assignment of a partition to a new server, splitting of a partition, and merging of two partitions, assignments of a partition to a new server causing the epoch number to increase by one, splitting of a partition into two or more new partitions causing each child partition to receive the parent's epoch number incremented by one, and when two partitions are merged, the epoch number for the merged partition being the maximum epoch number for any of the partitions prior to merge incremented by one, [0042] for the information in partition table 455 being populated based on the partition decisions made by partition master 460, and [0063] for the thresholds for initiating a merge of partitions being reduced as the number of partitions approaches the upper limit. 
The partition table tracks the number of partitions by tracking the current assignments of partitions to partition servers. Prior to merging, the highest epoch number of the partitions indicates the number of partitions. The partition master obtains the number of partitions from the partition table in order to determine if the number is approaching an upper limit.).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to modify the data processing system of Gould wherein the distributed query execution environment includes a search master configured to track the number of data partitions, and wherein the method further comprises obtaining the number of data partitions from the search master, as taught by Skjolsvold, for the benefit of reducing the overhead required to track and maintain the various partitions for a data set (see e.g., Skjolsvold, [0063]).

As to claim 15, the limitations of parent claim 7 have been discussed above. Gould teaches
wherein the at least one worker node is associated with a distributed query execution environment (see e.g., [0093] for the runtime environment providing for the profiling module 100 to execute as a parallel process and parallel processing systems including any configuration of computer systems using multiple central processing units (CPUs), either locally distributed or remotely distributed).
Gould does not specifically disclose wherein the distributed query execution environment includes a search master configured to track the number of data partitions, and wherein the method further comprises reporting the number of data partitions to the search master. However, Skjolsvold teaches
wherein the distributed query execution environment includes a search master [partition table] configured to track the number of data partitions, and wherein the method further comprises reporting the number of data partitions to the search master (see e.g., [0030] for a partition table being used to track the current assignments of partitions to partition servers, when an active master or dictator assigns a partition to a server, the partition table being updated first to reflect the assignment, the partition table then being used to determine the partition server that will handle a client request based on the key specified in the client request, as an example, an entry in a partition table including the low key for a range, the high key for the range, and the role instance or server instance that will perform a requested task on the data or state corresponding to requested key, and a partition table also including other data, such as an epoch number or version number, [0039] for each partition having a current epoch number that is updated by the master when a change occurs for the partition, examples of changes for a partition including assignment of a partition to a new server, splitting of a partition, and merging of two partitions, assignments of a partition to a new server causing the epoch number to increase by one, splitting of a partition into two or more new partitions causing each child partition to receive the parent's epoch number incremented by one, and when two partitions are merged, the epoch number for the merged partition being the maximum epoch number for any of the partitions prior to merge incremented by one, [0042] for the information in partition table 455 being populated based on the partition decisions made by partition master 460, and [0063] for the thresholds for initiating a merge of partitions being reduced as the number of partitions approaches the upper limit. 
The partition table tracks the number of partitions by tracking the current assignments of partitions to partition servers. Prior to merging, the highest epoch number of the partitions indicates the number of partitions. The partition master reports each partition and partition epoch number to the partition table.).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to modify the data processing system of Gould wherein the distributed query execution environment includes a search master configured to track the number of data partitions, and wherein the method further comprises reporting the number of data partitions to the search master, as taught by Skjolsvold, for the benefit of reducing the overhead required to track and maintain the various partitions for a data set (see e.g., Skjolsvold, [0063]).

As to claim 16, the limitations of parent claim 7 have been discussed above. Gould teaches
wherein the at least one worker node is associated with a distributed query execution environment (see e.g., [0093] for the runtime environment providing for the profiling module 100 to execute as a parallel process and parallel processing systems including any configuration of computer systems using multiple central processing units (CPUs), either locally distributed or remotely distributed).
Gould does not specifically disclose wherein the distributed query execution environment includes a search master configured to track the number of data partitions, and wherein the method further comprises reporting the number of data partitions to the search master and obtaining the number of data partitions from the search master in response to the reporting. However, Skjolsvold teaches
wherein the distributed query execution environment includes a search master [partition table] configured to track the number of data partitions, and wherein the method further comprises reporting the number of data partitions to the search master and obtaining the number of data partitions from the search master in response to the reporting (see e.g., [0030] for a partition table being used to track the current assignments of partitions to partition servers, when an active master or dictator assigns a partition to a server, the partition table being updated first to reflect the assignment, the partition table then being used to determine the partition server that will handle a client request based on the key specified in the client request, as an example, an entry in a partition table including the low key for a range, the high key for the range, and the role instance or server instance that will perform a requested task on the data or state corresponding to requested key, and a partition table also including other data, such as an epoch number or version number, [0039] for each partition having a current epoch number that is updated by the master when a change occurs for the partition, examples of changes for a partition including assignment of a partition to a new server, splitting of a partition, and merging of two partitions, assignments of a partition to a new server causing the epoch number to increase by one, splitting of a partition into two or more new partitions causing each child partition to receive the parent's epoch number incremented by one, and when two partitions are merged, the epoch number for the merged partition being the maximum epoch number for any of the partitions prior to merge incremented by one, [0042] for the information in partition table 455 being populated based on the partition decisions made by partition master 460, and [0063] for the thresholds for initiating a merge of partitions being reduced as the number of partitions approaches the upper limit. The partition table tracks the number of partitions by tracking the current assignments of partitions to partition servers. Prior to merging, the highest epoch number of the partitions indicates the number of partitions. The partition master reports each partition and partition epoch number to the partition table. The partition master obtains, in response to reporting additional partitions, the number of partitions from the partition table in order to determine if the number is approaching an upper limit.).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to modify the data processing system of Gould wherein the distributed query execution environment includes a search master configured to track the number of data partitions, and wherein the method further comprises reporting the number of data partitions to the search master and obtaining the number of data partitions from the search master in response to the reporting, as taught by Skjolsvold, for the benefit of reducing the overhead required to track and maintain the various partitions for a data set (see e.g., Skjolsvold, [0063]).

As to claim 21, the limitations of parent claim 1 have been discussed above. Gould teaches
wherein the query is associated with multiple chunks of data, and wherein the method is implemented prior to one or more chunks of data being obtained at the at least one worker node (see e.g., [0094] for a flow control mechanism being implemented using input queues for the links entering a component, [0095] for when two components are connected by a flow, the upstream component sending work elements to the downstream component as long as the downstream component keeps consuming the work elements and if the downstream component falls behind, the upstream component filling up the input queue of the downstream component and stopping work until the input queue clears out again, and [0203] for profiling module 100 generating a "virtual table" that includes fields from the multiple sources and the virtual table being generated, for example, by performing a join operation on the sources using a key field that is common to the sources. Records that are part of the join operation continue being input into queues of the profiling and processing subsystem 20 after the method is implemented for the initial portion of the data set.).

As to claim 22, the limitations of parent claim 1 have been discussed above. Gould teaches
wherein the field value is derived from a combination of fields of the plurality of records (see e.g., [0180] for a functional dependency relationship existing among a subset of fields where the value associated with one field of a record can be uniquely determined by the values associated with other fields of the record and for example, the value of the Zip Code field being uniquely determined by the values of a City field and a Street field. The Zip Code field is derived from a combination of City and Street fields.).

As to claim 23, the limitations of parent claim 1 have been discussed above. Gould in view of Skjolsvold does not specifically disclose wherein reducing the number of data partitions by aggregating records of the first partition with records of the second partition comprises selecting the first partition for aggregation based on a number of records within the first partition. However, Bellamkonda teaches
wherein reducing the number of data partitions by aggregating records of the first partition [partition selected to be spilled to disk] with records of the second partition comprises selecting the first partition for aggregation  based on a number of records [entries storing aggregations] within the first partition (see e.g., [0036] for each entry in a partition storing a running tally of the measure being aggregated and for example, if a query specifying sum of sales with group by customer ID is issued against table 102, then during processing of the query, an entry that includes a running summation of the sales values being stored in volatile memory in association with the corresponding partition 108a-108d, for each unique value of cust_id and [0060] for once a partition is spilled to persistent storage, then data items for that partition that are processed later (i.e., after that partition is spilled) no longer being aggregated on-the-fly, but simply being hashed and stored in volatile memory slots corresponding to the partition, and selecting the "victim" partition, i.e., the partition that gets spilled to disk, being generally based on the following, a partition that has already been spilled to disk being selected so as not to increase the number of partitions to be processed in phase two, and to be able to keep aggregating the partitions that are in memory. Aggregating records of the partition to be spilled to disk with records of the persistently stored partition includes selecting the partition to be spilled to disk based on the partition to be spilled to disk having zero entries storing aggregations.).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to modify the data processing system of Gould in view of Skjolsvold wherein reducing the number of data partitions by aggregating records of the first partition with records of the second partition comprises selecting the first partition for aggregation based on a number of records within the first partition, as taught by Bellamkonda, for the benefit of freeing the maximum amount of full memory (see e.g., Bellamkonda, [0058]).

As to claim 24, the limitations of parent claim 1 have been discussed above. Gould in view of Skjolsvold does not specifically disclose wherein reducing the number of data partitions by aggregating records of the first partition with records of the second partition comprises selecting the first partition for aggregation based on the first partition having a minimum number of records compared to other partitions of the set of data partitions. However, Bellamkonda teaches
wherein reducing the number of data partitions by aggregating records of the first partition [partition selected to be spilled to disk] with records of the second partition comprises selecting the first partition for aggregation based on the first partition having a minimum number of records [entries storing aggregations] compared to other partitions of the set of data partitions (see e.g., [0036] for each entry in a partition storing a running tally of the measure being aggregated and for example, if a query specifying sum of sales with group by customer ID is issued against table 102, then during processing of the query, an entry that includes a running summation of the sales values being stored in volatile memory in association with the corresponding partition 108a-108d, for each unique value of cust_id and [0060] for once a partition is spilled to persistent storage, then data items for that partition that are processed later (i.e., after that partition is spilled) no longer being aggregated on-the-fly, but simply being hashed and stored in volatile memory slots corresponding to the partition, and selecting the "victim" partition, i.e., the partition that gets spilled to disk, being generally based on the following, a partition that has already been spilled to disk being selected so as not to increase the number of partitions to be processed in phase two, and to be able to keep aggregating the partitions that are in memory. Aggregating records of the partition to be spilled to disk with records of the persistently stored partition includes selecting the partition to be spilled to disk based on the partition to be spilled to disk having zero entries storing aggregations.).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to modify the data processing system of Gould in view of Skjolsvold wherein reducing the number of data partitions by aggregating records of the first partition with records of the second partition comprises selecting the first partition for aggregation based on the first partition having a minimum number of records compared to other partitions of the set of data partitions, as taught by Bellamkonda, for the benefit of freeing the maximum amount of full memory (see e.g., Bellamkonda, [0058]).

As to claim 25, the limitations of parent claim 1 have been discussed above. Gould in view of Skjolsvold does not specifically disclose wherein reducing the number of data partitions by aggregating records of the first partition with records of the second partition comprises selecting the second partition for aggregation based the second partition having a highest number of records, compared to other partitions of the set of data partitions, that does not exceed a maximum number of records allowable within the second partition. However, Bellamkonda teaches
wherein reducing the number of data partitions by aggregating records of the first partition [partition selected to be spilled to disk] with records of the second partition comprises selecting the second partition for aggregation based the second partition having a highest number of records [entries], compared to other partitions of the set of data partitions, that does not exceed a maximum number of records [maximum number of entries capable of being stored in the volatile memory available] allowable within the second partition (see e.g., [0058] for while processing the input data items, if a data item is read, for which there is not enough volatile memory available to store the corresponding entry in the partition, then a partition being selected to spill to persistent storage (e.g., disk) and in one embodiment, the largest partition, i.e., the partition having the most volatile memory slots, being selected for spillage to persistent storage and [0062] for if no partition has been spilled to disk yet, then the largest in-memory partition being chosen so that more slots can be freed at once and intuitively, more records are going to get inserted into the largest partition because of skew in data, and this partition will eventually have to spill it to disk due to lack of memory. Aggregating records of the partition to be spilled to disk with records of the persistently stored partition includes selecting the persistently stored partition based on the persistently stored partition having a highest number of entries compared to other partitions storing aggregations, that does not exceed the maximum number of entries capable of being stored in the volatile memory available.).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to modify the data processing system of Gould in view of Skjolsvold wherein reducing the number of data partitions by aggregating records of the first partition with records of the second partition comprises selecting the second partition for aggregation based the second partition having a highest number of records, compared to other partitions of the set of data partitions, that does not exceed a maximum number of records allowable within the second partition, as taught by Bellamkonda, for the benefit of freeing the maximum amount of full memory (see e.g., Bellamkonda, [0058]).
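For illustration only, the selection criterion recited in claim 25 can be sketched as follows. This is a minimal sketch and is not taken from Bellamkonda or from the application; the function name `select_merge_target`, the dictionary of per-partition record counts, and the `max_records` parameter are assumptions introduced here. The sketch reads "does not exceed a maximum number of records" as a cap on the candidate partition's current record count.

```python
from typing import Dict, Optional


def select_merge_target(
    counts: Dict[str, int], source: str, max_records: int
) -> Optional[str]:
    """Pick the partition to aggregate `source` into (hypothetical helper).

    Among all partitions other than `source`, return the one holding the
    highest number of records that does not exceed `max_records`.
    Returns None when no partition qualifies.
    """
    # Keep only partitions that are not the source and are within the cap.
    candidates = {
        pid: n for pid, n in counts.items() if pid != source and n <= max_records
    }
    if not candidates:
        return None
    # The largest qualifying partition wins; ties go to the first one seen.
    return max(candidates, key=candidates.get)


if __name__ == "__main__":
    counts = {"p0": 3, "p1": 9, "p2": 7, "p3": 12}
    print(select_merge_target(counts, source="p0", max_records=10))
```

In this sketch, partition `p3` is skipped because it already exceeds the assumed cap of 10 records, so `p1` is selected.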

As to claim 26, Gould teaches a system implementing a worker node [profiling and processing subsystem], the system comprising: 
a data store [data storage system] including computer-executable instructions [procedures] (see e.g., [0207] for the software forming procedures in one or more computer programs that execute on one or more programmed or programmable computer systems (which may be of various architectures, such as distributed, client/server, or grid) each including at least one processor, at least one data storage system (for example, volatile and non-volatile memory and/or storage elements), at least one input device or port, and at least one output device or port); and 
a processor in communication with the data store and configured to execute the computer-executable instructions (see e.g., [0207] for the software forming procedures in one or more computer programs that execute on one or more programmed or programmable computer systems (which may be of various architectures, such as distributed, client/server, or grid) each including at least one processor, at least one data storage system (for example, volatile and non-volatile memory and/or storage elements), at least one input device or port, and at least one output device or port) to:
obtain a plurality of records associated with a query (see e.g., [0072] for a data processing system 10 including a profiling and processing subsystem 20, which is used to process data from data sources 30, [0076] for the profiling and processing subsystem 20 including a profiling module 100, which reads data directly from a data source without necessarily landing a complete copy of the data to a storage medium before profiling in units of discrete work elements such as individual records, [0087] for the profiling module 100 obtaining an initial portion of the data set, [0079] for the profiling module 100 reading records from a data source, and [0203] for profiling module 100 generating a "virtual table" that includes fields from the multiple sources and the virtual table being generated, for example, by performing a join operation on the sources using a key field that is common to the sources. The profiling and processing subsystem obtains records associated with a join query.);
assign records of the plurality of records to individual data partitions of a set of data partitions at the worker node, wherein individual partitions of the set of data partitions correspond to distinct portions of physical data storage [parallel processors and/or computers] of the worker node (see e.g., [0125] for the partition by round-robin component 612 taking records from the single or multiple partitions of the input data set 402 and re-partitioning the records among a number of parallel processors and/or computers (e.g., as selected by the user) in order to balance the work load among the processors and/or computers. The profiling and processing subsystem assigns the records to partitions. The partitions correspond to distinct processors and/or computers.); and
reduce a number of partitions in the set of data partitions by:
aggregating records [census elements] of a first partition with records [census elements] of a second partition by relocating at least a first record having a field value from the distinct portion of physical data storage corresponding to the first partition to the distinct portion of physical data storage corresponding to the second partition (see e.g., [0126] for the canonicalize component 616 taking in a flow of records and sending out a flow of census elements containing a field/value pair representing values for each field in an input record, an input record with ten fields yielding a flow of ten census elements and the census elements flowing into a local rollup field/value component which (for each partition) takes occurrences of the same value for the same field and combines them into one census element including a count of the number of occurrences, [0128] for the partition by field/value component 624 re-partitioning the census elements by field and value so that the rollup process performed in the global rollup field/value component 626 can add the occurrences calculated in different partitions to produce a total occurrences count in a single census element for each unique field/value pair contained within the profiled records and the global rollup field/value component 626 processing these census elements in potentially multiple partitions for a potentially parallel file represented by the census file component 410, and [0129] for a partition by field component 632 reading a flow of census elements from the census file component 410 and re-partitioning the census elements according to a hash value based on the field such that census records with the same field (but different values) are in the same partition. Census elements of a first partition are aggregated with census elements of a second partition by relocating at least a first census element having a field value from the first partition to the second partition. 
The number of partitions is then reduced when census records with the same field but different values are merged into a single partition.).
Gould does not specifically disclose reducing a number of partitions in the set of data partitions by: removing the first partition from the worker node. However, Skjolsvold teaches reducing a number of partitions in the set of data partitions by:
removing the first partition from the worker node [current server] (see e.g., [0083] for when partitions are merged, as an initial step the partitions for merger being unassigned from the current server, for example, a first partition on server S2 having a low key value of K and a high key value of M, in this example, the epoch number for the first partition being 7, a second partition on server S4 having a low key value of M and a high key value of N, the epoch value for the second partition being 9 in this example, as an initial step, the partitions being unassigned from their respective servers, so that the partition table shows a non-assigned value for the server, the two partition entries being then replaced with a single entry having a low key of K and a high key of N, the epoch number assigned to this partition being one greater than the highest value of the merged partitions, which corresponds to 10 in this example, and the new partition then being assigned to a server. The number of partitions is reduced by removing the first partition from the current server.).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to modify the data processing system of Gould to reduce a number of partitions in the set of data partitions by: removing the first partition from the worker node, as taught by Skjolsvold, for the benefit of reducing the overhead required to track and maintain the various partitions for a data set (see e.g., Skjolsvold, [0063]).
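For illustration only, the merge sequence cited from Skjolsvold, [0083] can be sketched as follows. This is a minimal sketch under stated assumptions, not Skjolsvold's implementation: the `PartitionEntry` type, the `merge_partitions` function, and the choice of server for the merged partition are hypothetical. The key ranges and epoch numbers come from the cited example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PartitionEntry:
    """One row of a partition table (hypothetical layout)."""
    low_key: str            # inclusive lower bound of the key range
    high_key: str           # exclusive upper bound of the key range
    epoch: int
    server: Optional[str]   # None means "not assigned"


def merge_partitions(
    table: List[PartitionEntry],
    first: PartitionEntry,
    second: PartitionEntry,
    new_server: str,
) -> List[PartitionEntry]:
    """Merge two adjacent partitions into a single table entry."""
    if first.high_key != second.low_key:
        raise ValueError("partitions must cover adjacent key ranges")
    # Step 1: unassign both partitions from their current servers.
    first.server = None
    second.server = None
    # Step 2: replace the two entries with one entry spanning both ranges,
    # with an epoch one greater than the highest epoch of the merged pair.
    merged = PartitionEntry(
        low_key=first.low_key,
        high_key=second.high_key,
        epoch=max(first.epoch, second.epoch) + 1,
        server=None,
    )
    new_table = [e for e in table if e is not first and e is not second]
    new_table.append(merged)
    # Step 3: assign the new partition to a server (choice is an assumption).
    merged.server = new_server
    return new_table


if __name__ == "__main__":
    table = [PartitionEntry("K", "M", 7, "S2"), PartitionEntry("M", "N", 9, "S4")]
    print(merge_partitions(table, table[0], table[1], new_server="S2"))
```

With the values from the cited example, the two entries (K to M, epoch 7, and M to N, epoch 9) are replaced by a single entry covering K to N with epoch 10, so the table holds one fewer partition.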
Gould in view of Skjolsvold does not specifically disclose wherein the second partition has a highest number of records sharing the field value, among the set of data partitions. However, Bellamkonda teaches
wherein the second partition [persistently stored partition] has a highest number of records [entries] sharing the field value [customer ID value], among the set of data partitions [partitions storing aggregations] (see e.g., [0003] for data in a relational database management system (RDBMS) being aggregated in response to a query, such as a SQL query, that includes an aggregation function (e.g., SUM, COUNT, AVG, etc.) with a GROUP BY clause, [0035] for if the query that specifies the aggregation operation specifies grouping by customer ID (e.g., GROUPBY cust_id), then the value in the customer ID field being input to the hash function 104, which generates a hash value and the hash function being constructed so that the hash values are determinative of to which partition 108a-108d each data item will be partitioned, [0036] for each entry in a partition storing a running tally of the measure being aggregated and for example, if a query specifying sum of sales with group by customer ID is issued against table 102, then during processing of the query, an entry that includes a running summation of the sales values being stored in volatile memory in association with the corresponding partition 108a-108d, for each unique value of cust_id, [0058] for while processing the input data items, if a data item is read, for which there is not enough volatile memory available to store the corresponding entry in the partition, then a partition being selected to spill to persistent storage (e.g., disk) and in one embodiment, the largest partition, i.e., the partition having the most volatile memory slots, being selected for spillage to persistent storage, [0060] for once a partition is spilled to persistent storage, then data items for that partition that are processed later (i.e., after that partition is spilled) no longer being aggregated on-the-fly, but simply being hashed and stored in volatile memory slots corresponding to the partition, and selecting the "victim" partition, i.e., the partition that gets spilled to disk, being generally based on the following, a partition that has already been spilled to disk being selected so as not to increase the number of partitions to be processed in phase two, and to be able to keep aggregating the partitions that are in memory. Since a partition that has already been spilled to disk is selected to be spilled to disk again, the persistently stored partition has, among the partitions storing aggregations, the highest number of entries sharing the same customer ID value as the partition selected to be spilled to disk.).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to modify the data processing system of Gould in view of Skjolsvold wherein the second partition has a highest number of records sharing the field value, among the set of data partitions, as taught by Bellamkonda, for the benefit of freeing the maximum amount of full memory (see e.g., Bellamkonda, [0058]).
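For illustration only, the hash-partitioned aggregation described in the cited passages of Bellamkonda ([0035]-[0036]) can be sketched as follows. This is a minimal sketch under stated assumptions, not Bellamkonda's implementation: the function name `hash_partition_sum`, the use of a CRC32 checksum as the hash function, and the in-memory dictionaries standing in for partitions are all hypothetical. Spilling to persistent storage is not modeled.

```python
import zlib
from typing import Dict, Iterable, List, Tuple


def hash_partition_sum(
    rows: Iterable[Tuple[int, float]], num_partitions: int
) -> List[Dict[int, float]]:
    """Aggregate SUM(sales) GROUP BY cust_id across hash partitions.

    Each partition maps a cust_id to a running total, so every distinct
    cust_id has exactly one entry, held in exactly one partition.
    """
    partitions: List[Dict[int, float]] = [{} for _ in range(num_partitions)]
    for cust_id, sales in rows:
        # The hash of the grouping key determines the target partition.
        # CRC32 is an assumption chosen only because it is deterministic.
        idx = zlib.crc32(str(cust_id).encode("utf-8")) % num_partitions
        entry = partitions[idx]
        # Update the running tally for this key.
        entry[cust_id] = entry.get(cust_id, 0.0) + sales
    return partitions


if __name__ == "__main__":
    rows = [(1, 10.0), (2, 5.0), (1, 2.5)]
    for i, part in enumerate(hash_partition_sum(rows, num_partitions=4)):
        print(i, part)
```

Because the partition is chosen from the hash of the grouping key, all rows sharing a customer ID value are tallied in the same partition.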

As to claim 27, the limitations of parent claim 26 have been discussed above. Gould does not specifically disclose wherein the set of data partitions is a first group of partitions, and wherein the worker node maintains a plurality of groups of partitions, each group of partitions associated with a subset of potential field values. However, Skjolsvold teaches
wherein the set of data partitions is a first group of partitions, and wherein the worker node [partition master] maintains a plurality of groups of partitions [partitions assigned to partition servers], each group of partitions associated with a subset of potential field values [keys]  (see e.g., [0017] for a key being a value from a namespace or domain, an example of a namespace being an identifier corresponding to all storage accounts in a cloud computing environment, and in such an example, a key corresponding to an account name, account number, or another identifier that allows a specific account to be referenced, [0018] for a "partition" being a range defined by a low (inclusive) and high (exclusive) key, [0019] for a "partition server" being a virtual machine within a cloud computing environment that corresponds to a role instance for serving zero or more partitions, [0020] for a "partition master" being a role that manages partition servers for a given type of role, such as by assigning and unassigning partitions to partition servers, and [0029] for based on a namespace, a computation being partitioned so that multiple partition servers handle or perform different portions of processing for the namespace, each partition corresponding to a range of key values, and when a partition is assigned to a partition server, the server performing the desired computation for any requests that contain a key value within the range corresponding to an assigned partition. The partition master maintains a plurality of groups of partitions assigned to partition servers. Each group of partitions assigned to a partition server is associated with a subset of keys of an identifier.).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to modify the data processing system of Gould wherein the set of data partitions is a first group of partitions, and wherein the worker node maintains a plurality of groups of partitions, each group of partitions associated with a subset of potential field values, as taught by Skjolsvold, for the benefit of providing a framework for handling features such as scalability, fault tolerance, and/or availability while reducing or minimizing the amount of effort required to address these features (see e.g., Skjolsvold, [0016]).

As to claim 28, the limitations of parent claim 26 have been discussed above. Gould does not specifically disclose the set of data partitions is a first group of partitions, wherein the worker node maintains a plurality of groups of partitions, and wherein a number of the groups of partitions is equal to a number of processor cores of the worker node. However, Skjolsvold teaches
the set of data partitions is a first group of partitions, wherein the worker node [partition master] maintains a plurality of groups of partitions [partitions assigned to partition servers], and wherein a number of the groups of partitions is equal to a number of processor cores [partition servers] of the worker node (see e.g., [0017] for a key being a value from a namespace or domain, an example of a namespace being an identifier corresponding to all storage accounts in a cloud computing environment, and in such an example, a key corresponding to an account name, account number, or another identifier that allows a specific account to be referenced, [0018] for a "partition" being a range defined by a low (inclusive) and high (exclusive) key, [0019] for a "partition server" being a virtual machine within a cloud computing environment that corresponds to a role instance for serving zero or more partitions, [0020] for a "partition master" being a role that manages partition servers for a given type of role, such as by assigning and unassigning partitions to partition servers, and [0029] for based on a namespace, a computation being partitioned so that multiple partition servers handle or perform different portions of processing for the namespace, each partition corresponding to a range of key values, and when a partition is assigned to a partition server, the server performing the desired computation for any requests that contain a key value within the range corresponding to an assigned partition. The partition master maintains a plurality of groups of partitions assigned to partition servers. The number of groups of partitions assigned to partition servers is equal to the number of partition servers.).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to modify the data processing system of Gould wherein the set of data partitions is a first group of partitions, wherein the worker node maintains a plurality of groups of partitions, and wherein a number of the groups of partitions is equal to a number of processor cores of the worker node, as taught by Skjolsvold, for the benefit of providing a framework for handling features such as scalability, fault tolerance, and/or availability while reducing or minimizing the amount of effort required to address these features (see e.g., Skjolsvold, [0016]).

As to claim 29, Gould teaches non-transitory computer-readable media comprising computer-executable instructions (see e.g., [0208] for each such computer program being preferably stored on or downloaded to a storage media or device (e.g., solid state memory or media, or magnetic or optical media) readable by a general or special purpose programmable computer, for configuring and operating the computer when the storage media or device is read by the computer system to perform the procedures described herein) that, when executed by a worker node [profiling and processing subsystem], cause the worker node to:
obtain a plurality of records associated with a query (see e.g., [0072] for a data processing system 10 including a profiling and processing subsystem 20, which is used to process data from data sources 30, [0076] for the profiling and processing subsystem 20 including a profiling module 100, which reads data directly from a data source without necessarily landing a complete copy of the data to a storage medium before profiling in units of discrete work elements such as individual records, [0087] for the profiling module 100 obtaining an initial portion of the data set, [0079] for the profiling module 100 reading records from a data source, and [0203] for profiling module 100 generating a "virtual table" that includes fields from the multiple sources and the virtual table being generated, for example, by performing a join operation on the sources using a key field that is common to the sources. The profiling and processing subsystem obtains records associated with a join query.);
assign records of the plurality of records to individual data partitions of a set of data partitions at the worker node, wherein individual partitions of the set of data partitions correspond to distinct portions of physical data storage [parallel processors and/or computers] of the worker node (see e.g., [0125] for the partition by round-robin component 612 taking records from the single or multiple partitions of the input data set 402 and re-partitioning the records among a number of parallel processors and/or computers (e.g., as selected by the user) in order to balance the work load among the processors and/or computers. The profiling and processing subsystem assigns the records to partitions. The partitions correspond to distinct processors and/or computers.); and
reduce a number of partitions in the set of data partitions by:
aggregating records [census elements] of a first partition with records [census elements] of a second partition by relocating at least a first record having a field value from the distinct portion of physical data storage corresponding to the first partition to the distinct portion of physical data storage corresponding to the second partition (see e.g., [0126] for the canonicalize component 616 taking in a flow of records and sending out a flow of census elements containing a field/value pair representing values for each field in an input record, an input record with ten fields yielding a flow of ten census elements and the census elements flowing into a local rollup field/value component which (for each partition) takes occurrences of the same value for the same field and combines them into one census element including a count of the number of occurrences, [0128] for the partition by field/value component 624 re-partitioning the census elements by field and value so that the rollup process performed in the global rollup field/value component 626 can add the occurrences calculated in different partitions to produce a total occurrences count in a single census element for each unique field/value pair contained within the profiled records and the global rollup field/value component 626 processing these census elements in potentially multiple partitions for a potentially parallel file represented by the census file component 410, and [0129] for a partition by field component 632 reading a flow of census elements from the census file component 410 and re-partitioning the census elements according to a hash value based on the field such that census records with the same field (but different values) are in the same partition. Census elements of a first partition are aggregated with census elements of a second partition by relocating at least a first census element having a field value from the first partition to the second partition. 
The number of partitions is then reduced when census records with the same field but different values are merged into a single partition.).
Gould does not specifically disclose reducing a number of partitions in the set of data partitions by: removing the first partition from the worker node. However, Skjolsvold teaches reducing a number of partitions in the set of data partitions by:
removing the first partition from the worker node [current server] (see e.g., [0083] for when partitions are merged, as an initial step the partitions for merger being unassigned from the current server, for example, a first partition on server S2 having a low key value of K and a high key value of M, in this example, the epoch number for the first partition being 7, a second partition on server S4 having a low key value of M and a high key value of N, the epoch value for the second partition being 9 in this example, as an initial step, the partitions being unassigned from their respective servers, so that the partition table shows a non-assigned value for the server, the two partition entries being then replaced with a single entry having a low key of K and a high key of N, the epoch number assigned to this partition being one greater than the highest value of the merged partitions, which corresponds to 10 in this example, and the new partition then being assigned to a server. The number of partitions is reduced by removing the first partition from the current server.).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to modify the data processing system of Gould to reduce a number of partitions in the set of data partitions by: removing the first partition from the worker node, as taught by Skjolsvold, for the benefit of reducing the overhead required to track and maintain the various partitions for a data set (see e.g., Skjolsvold, [0063]).
Gould in view of Skjolsvold does not specifically disclose wherein the second partition has a highest number of records sharing the field value, among the set of data partitions. However, Bellamkonda teaches
wherein the second partition [persistently stored partition] has a highest number of records [entries] sharing the field value [customer ID value], among the set of data partitions [partitions storing aggregations] (see e.g., [0003] for data in a relational database management system (RDBMS) being aggregated in response to a query, such as a SQL query, that includes an aggregation function (e.g., SUM, COUNT, AVG, etc.) with a GROUP BY clause, [0035] for if the query that specifies the aggregation operation specifies grouping by customer ID (e.g., GROUPBY cust_id), then the value in the customer ID field being input to the hash function 104, which generates a hash value and the hash function being constructed so that the hash values are determinative of to which partition 108a-108d each data item will be partitioned, [0036] for each entry in a partition storing a running tally of the measure being aggregated and for example, if a query specifying sum of sales with group by customer ID is issued against table 102, then during processing of the query, an entry that includes a running summation of the sales values being stored in volatile memory in association with the corresponding partition 108a-108d, for each unique value of cust_id, [0058] for while processing the input data items, if a data item is read, for which there is not enough volatile memory available to store the corresponding entry in the partition, then a partition being selected to spill to persistent storage (e.g., disk) and in one embodiment, the largest partition, i.e., the partition having the most volatile memory slots, being selected for spillage to persistent storage, [0060] for once a partition is spilled to persistent storage, then data items for that partition that are processed later (i.e., after that partition is spilled) no longer being aggregated on-the-fly, but simply being hashed and stored in volatile memory slots corresponding to the partition, and selecting the "victim" partition, i.e., the partition that gets spilled to disk, being generally based on the following, a partition that has already been spilled to disk being selected so as not to increase the number of partitions to be processed in phase two, and to be able to keep aggregating the partitions that are in memory. Since a partition that has already been spilled to disk is selected to be spilled to disk again, the persistently stored partition has, among the partitions storing aggregations, the highest number of entries sharing the same customer ID value as the partition selected to be spilled to disk.).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to modify the data processing system of Gould in view of Skjolsvold wherein the second partition has a highest number of records sharing the field value, among the set of data partitions, as taught by Bellamkonda, for the benefit of freeing the maximum amount of full memory (see e.g., Bellamkonda, [0058]).

As to claim 30, the limitations of parent claim 29 have been discussed above. Gould does not specifically disclose wherein the set of data partitions is a first group of partitions, and wherein the worker node maintains a plurality of groups of partitions, each group of partitions associated with a subset of potential field values. However, Skjolsvold teaches
wherein the set of data partitions is a first group of partitions, and wherein the worker node [partition master] maintains a plurality of groups of partitions [partitions assigned to partition servers], each group of partitions associated with a subset of potential field values [keys]  (see e.g., [0017] for a key being a value from a namespace or domain, an example of a namespace being an identifier corresponding to all storage accounts in a cloud computing environment, and in such an example, a key corresponding to an account name, account number, or another identifier that allows a specific account to be referenced, [0018] for a "partition" being a range defined by a low (inclusive) and high (exclusive) key, [0019] for a "partition server" being a virtual machine within a cloud computing environment that corresponds to a role instance for serving zero or more partitions, [0020] for a "partition master" being a role that manages partition servers for a given type of role, such as by assigning and unassigning partitions to partition servers, and [0029] for based on a namespace, a computation being partitioned so that multiple partition servers handle or perform different portions of processing for the namespace, each partition corresponding to a range of key values, and when a partition is assigned to a partition server, the server performing the desired computation for any requests that contain a key value within the range corresponding to an assigned partition. The partition master maintains a plurality of groups of partitions assigned to partition servers. Each group of partitions assigned to a partition server is associated with a subset of keys of an identifier.).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to modify the data processing system of Gould wherein the set of data partitions is a first group of partitions, and wherein the worker node maintains a plurality of groups of partitions, each group of partitions associated with a subset of potential field values, as taught by Skjolsvold, for the benefit of providing a framework for handling features such as scalability, fault tolerance, and/or availability while reducing or minimizing the amount of effort required to address these features (see e.g., Skjolsvold, [0016]).

Claims 5, 6, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Gould et al. (US Publication No. 2005/0102325) and Skjolsvold et al. (US Publication No. 2013/0204991) in view of Bellamkonda et al. (US Publication No. 2006/0116989) as applied to claims 1-4, 7-16, and 21-30 above, and further in view of Kim (US Publication No. 2020/0057818).

As to claim 5, the limitations of parent claim 1 have been discussed above. Gould and Skjolsvold in view of Bellamkonda does not specifically disclose wherein each data partition of the set of data partitions contains records received at the at least one worker node during a distinct time period. However, Kim teaches
wherein each data partition of the set of data partitions contains records received at the at least one worker node during a distinct time period (see e.g., [0071] for referring to FIG. 3, tag names appearing repetitively in several partitions, with real-time input, sensor data occurring continuously with respect to time, however, as regards time values, data being typically inputted sequentially from the past to the present, therefore, if the minimum and maximum values of time for a partition are maintained in the memory, it being possible to forego reading several partitions based on the condition of input time, when merging indexes, the minimum and maximum values for the time values being obtained and recorded in the partition header, and such information being maintained. Partition 0 contains records received from 00:00 to 00:10. Partition 1 contains records received from 00:10 to 00:20. Partition 2 contains records received from 00:20 to 00:30. Partition 3 contains records received from 00:40 to 00:50.).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to modify the data processing system of Gould and Skjolsvold in view of Bellamkonda wherein each data partition of the set of data partitions contains records received at the at least one worker node during a distinct time period, as taught by Kim, for the benefit of reducing the computational cost of searching time series data (see e.g., Kim, [0078]-[0079]).
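For illustration only, the use of per-partition minimum and maximum time values cited from Kim, [0071] can be sketched as follows. This is a minimal sketch under stated assumptions, not Kim's implementation: the `PartitionHeader` type, the `partitions_to_read` function, and the representation of time as whole minutes are hypothetical. The four time ranges mirror the partition example given above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PartitionHeader:
    """Time bounds recorded for one partition (hypothetical layout)."""
    partition_id: int
    min_time: int  # minutes since 00:00, earliest record in the partition
    max_time: int  # minutes since 00:00, latest record in the partition


def partitions_to_read(
    headers: List[PartitionHeader], start: int, end: int
) -> List[int]:
    """Return ids of partitions whose time range overlaps [start, end].

    Partitions with no overlap are skipped without being read.
    """
    return [
        h.partition_id
        for h in headers
        if h.min_time <= end and h.max_time >= start
    ]


if __name__ == "__main__":
    headers = [
        PartitionHeader(0, 0, 10),
        PartitionHeader(1, 10, 20),
        PartitionHeader(2, 20, 30),
        PartitionHeader(3, 40, 50),
    ]
    print(partitions_to_read(headers, start=12, end=18))
```

A query for records between 00:12 and 00:18 would, in this sketch, read only Partition 1, since the other partitions' recorded time bounds fall outside the requested range.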

As to claim 6, the limitations of parent claim 1 have been discussed above. Gould and Skjolsvold in view of Bellamkonda does not specifically disclose wherein assigning records of the plurality of records to individual data partitions of the set of data partitions at the at least one worker node comprises assigning records to an individual data partition of the set of data partitions until the individual data partition reaches a maximum number of records and then assigning records to a second individual data partition of the set of data partitions. However, Kim teaches
wherein assigning records of the plurality of records to individual data partitions of the set of data partitions at the at least one worker node comprises assigning records to an individual data partition of the set of data partitions until the individual data partition reaches a maximum number of records and then assigning records to a second individual data partition of the set of data partitions (see e.g., [0040] for the records being simply stored until the number of inputted records reaches the partition's maximum count and FIG. 2 and [0048] for the data partition size at the initial level being 4. Records 0-3 are assigned to Partition 0 and Records 4-7 are assigned to Partition 1.).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to modify the data processing system of Gould and Skjolsvold in view of Bellamkonda wherein assigning records of the plurality of records to individual data partitions of the set of data partitions at the at least one worker node comprises assigning records to an individual data partition of the set of data partitions until the individual data partition reaches a maximum number of records and then assigning records to a second individual data partition of the set of data partitions, as taught by Kim, for the benefit of reducing the computational cost of searching time series data (see e.g., Kim, [0078]-[0079]).
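For illustration only, the count-based rollover mapped above (records 0-3 to Partition 0, records 4-7 to Partition 1, with a partition size of 4) can be sketched as follows. The function name `assign_records` and the list-of-lists representation are assumptions made for this sketch and are not taken from Kim.

```python
def assign_records(records, max_records):
    """Fill one partition until it holds max_records, then start the next one."""
    if max_records < 1:
        raise ValueError("max_records must be at least 1")
    partitions = [[]]
    for record in records:
        if len(partitions[-1]) == max_records:
            partitions.append([])  # current partition is full; open a new one
        partitions[-1].append(record)
    return partitions


result = assign_records(list(range(8)), max_records=4)
print(result)  # [[0, 1, 2, 3], [4, 5, 6, 7]]
```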

As to claim 20, the limitations of parent claim 1 have been discussed above. Gould and Skjolsvold in view of Bellamkonda does not specifically disclose combining, within the second partition, two or more records sharing the field value into an individual record having the field value; and reducing the set of data partitions by aggregating records of the second partition with records of a third partition and removing the second partition from the at least one worker node. However, Kim teaches
combining, within the second partition [Block 0-1], two or more records sharing the field [tag] value into an individual record having the field value (see e.g., [0049] for FIG. 2 showing how multiple blocks configured with multiple records having the fields <Tag, Time, Value> may be merged together, for example, Block 0 and Block 1 from before the merging being merged together to form Block 0-1, while Block 2 and Block 3 being merged to form Block 2-3, and the merged Block 0-1 and Block 2-3 being configured to have the fields <Tag, Count, Time, Value, Row ID>, here, the “count” field representing the number of records having the same tag, for example, for Block 0 and Block 1 from before the merging, the numbers of records of which the tag ID is 0 being two and one, respectively, and accordingly, from the value of the “count” field, it being seen that the merged Block 0-1 has three records of which the tag ID is 0. Records sharing the same tag value are combined within Block 0-1 into an individual record having the tag value.); and 
reducing the set of data partitions by aggregating records of the second partition with records of a third partition [Block 2-3] and removing the second partition from the at least one node (see e.g., [0050] for Block 0-1 and Block 2-3, which have undergone a primary merging, being merged to generate Block 0-3 via a secondary merging, by repeating the merging steps in a leveled manner, including a primary merging process and a secondary merging process, there being the advantage that it is possible to generate one index file for a hundred million or more pieces of data, and also, when generating a second index file of a subsequent level from the partitioned first index files of a previous level is completed, then the completed status of the second index file being recorded in the head region and tail region of the second index file, and there being the advantage that the first index files can be deleted. The set of data partitions is reduced by aggregating records of Block 0-1 with records of Block 2-3 and removing Block 0-1.).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to modify the data processing system of Gould and Skjolsvold in view of Bellamkonda to combine, within the second partition, two or more records sharing the field value into an individual record having the field value; and reduce the set of data partitions by aggregating records of the second partition with records of a third partition and removing the second partition from the at least one worker node, as taught by Kim, for the benefit of reducing the computational cost of searching time series data (see e.g., Kim, [0078]-[0079]).
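For illustration only, the two mechanisms mapped above (combining records that share a tag into one entry carrying a count, and removing the source blocks once the merged block exists) can be sketched as follows. The record values, the `merge_blocks` function, and the dictionary used as a block store are hypothetical. The sketch shows a single merge step and omits the Row ID field and the leveled secondary merge.

```python
from collections import defaultdict


def merge_blocks(*blocks):
    """Merge blocks of (tag, time, value) records into one entry per tag.

    Each merged entry keeps a count of the source records plus their
    (time, value) pairs in time order.
    """
    grouped = defaultdict(list)
    for block in blocks:
        for tag, time, value in block:
            grouped[tag].append((time, value))
    return {
        tag: {"count": len(rows), "rows": sorted(rows)}
        for tag, rows in sorted(grouped.items())
    }


# Hypothetical data: tag 0 appears twice in block_0 and once in block_1.
block_0 = [(0, 1, 10.0), (1, 2, 20.0), (0, 3, 11.0), (2, 4, 30.0)]
block_1 = [(1, 5, 21.0), (0, 6, 12.0), (2, 7, 31.0), (2, 8, 32.0)]

store = {"block_0": block_0, "block_1": block_1}
# pop() removes each source block from the store as it is merged.
store["block_0_1"] = merge_blocks(store.pop("block_0"), store.pop("block_1"))

print(store["block_0_1"][0]["count"])  # 3
print(sorted(store))                   # ['block_0_1']
```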

Claims 17-19 are rejected under 35 U.S.C. 103 as being unpatentable over Gould et al. (US Publication No. 2005/0102325) and Skjolsvold et al. (US Publication No. 2013/0204991) in view of Bellamkonda et al. (US Publication No. 2006/0116989) as applied to claims 1-4, 7-16, and 21-30 above, and further in view of Chhabra et al. (US Publication No. 2019/0229924).

As to claim 17, the limitations of parent claim 7 have been discussed above. Gould does not specifically disclose a memory allocated to track the number of data partitions. However, Skjolsvold teaches
a memory [partition table] allocated to track the number of data partitions (see e.g., [0030] for a partition table being used to track the current assignments of partitions to partition servers, when an active master or dictator assigns a partition to a server, the partition table being updated first to reflect the assignment, the partition table then being used to determine the partition server that will handle a client request based on the key specified in the client request, as an example, an entry in a partition table including the low key for a range, the high key for the range, and the role instance or server instance that will perform a requested task on the data or state corresponding to requested key, and a partition table also including other data, such as an epoch number or version number, [0039] for each partition having a current epoch number that is updated by the master when a change occurs for the partition, examples of changes for a partition including assignment of a partition to a new server, splitting of a partition, and merging of two partitions, assignments of a partition to a new server causing the epoch number to increase by one, splitting of a partition into two or more new partitions causing each child partition to receive the parent's epoch number incremented by one, and when two partitions are merged, the epoch number for the merged partition being the maximum epoch number for any of the partitions prior to merge incremented by one, [0042] for the information in partition table 455 being populated based on the partition decisions made by partition master 460, and [0063] for the thresholds for initiating a merge of partitions being reduced as the number of partitions approaches the upper limit. The partition table tracks the number of partitions by tracking the current assignments of partitions to partition servers. 
The partitions master obtains the number of partitions from the partition table in order to determine if the number is approaching an upper limit.).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to modify the data processing system of Gould to include a memory allocated to track the number of data partitions, as taught by Skjolsvold, for the benefit of reducing the overhead required to track and maintain the various partitions for a data set (see e.g., Skjolsvold, [0063]).
Gould and Skjolsvold in view of Bellamkonda does not specifically disclose wherein the threshold value is set based on the memory allocated to track the number. However, Chhabra teaches
wherein the threshold value is set based on the memory allocated [counter] to track the number (see e.g., [0068] for it being important for the counter to be large enough to prevent an overflow, wherein the counter reaches its maximum value, in its foreseeable lifetime and [0081] for setting a threshold that indicates when the counter is getting close to entering the overflow state. A threshold is set based on the size of the counter.).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to modify the data processing system of Gould and Skjolsvold in view of Bellamkonda wherein the threshold value is set based on the memory allocated to track the number, as taught by Chhabra, for the benefit of reducing data processing overhead by reducing counter overflow (see e.g., Chhabra, [0078] and [0081]).

As to claim 18, the limitations of parent claim 1 have been discussed above. Gould teaches
wherein the memory allocated [number of bytes] is determined from a data type of a variable allocated (see e.g., [0114] for referring to FIG. 4, a type object 502 being, for example, a base type 504 or a compound type 506, a base type object 504 specifying how to interpret a string of bits (of a given length) as a single value, the base type object 504 including a length specification indicating the number of raw data bits to be read and parsed, and a length specification indicating a fixed length, such as a specified number of bytes. The number of bytes allocated is determined from a data type of a variable.).
Gould does not specifically disclose a memory allocated to track the number of data partitions. However, Skjolsvold teaches
a memory [partition table] allocated to track the number of data partitions (see e.g., [0030] for a partition table being used to track the current assignments of partitions to partition servers, when an active master or dictator assigns a partition to a server, the partition table being updated first to reflect the assignment, the partition table then being used to determine the partition server that will handle a client request based on the key specified in the client request, as an example, an entry in a partition table including the low key for a range, the high key for the range, and the role instance or server instance that will perform a requested task on the data or state corresponding to requested key, and a partition table also including other data, such as an epoch number or version number, [0039] for each partition having a current epoch number that is updated by the master when a change occurs for the partition, examples of changes for a partition including assignment of a partition to a new server, splitting of a partition, and merging of two partitions, assignments of a partition to a new server causing the epoch number to increase by one, splitting of a partition into two or more new partitions causing each child partition to receive the parent's epoch number incremented by one, and when two partitions are merged, the epoch number for the merged partition being the maximum epoch number for any of the partitions prior to merge incremented by one, [0042] for the information in partition table 455 being populated based on the partition decisions made by partition master 460, and [0063] for the thresholds for initiating a merge of partitions being reduced as the number of partitions approaches the upper limit. The partition table tracks the number of partitions by tracking the current assignments of partitions to partition servers. 
The partitions master obtains the number of partitions from the partition table in order to determine if the number is approaching an upper limit.).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to modify the data processing system of Gould to include a memory allocated to track the number of data partitions, as taught by Skjolsvold, for the benefit of reducing the overhead required to track and maintain the various partitions for a data set (see e.g., Skjolsvold, [0063]).
Gould and Skjolsvold in view of Bellamkonda does not specifically disclose wherein the threshold value is set based on the memory allocated to track the number; and a variable allocated to track the number. However, Chhabra teaches
wherein the threshold value is set based on the memory allocated [counter] to track the number (see e.g., [0068] for it being important for the counter to be large enough to prevent an overflow, wherein the counter reaches its maximum value, in its foreseeable lifetime and [0081] for setting a threshold that indicates when the counter is getting close to entering the overflow state. A threshold is set based on the size of the counter.); and
a variable [counter] allocated to track the number (see e.g., [0056] for counters being incremented).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to modify the data processing system of Gould and Skjolsvold in view of Bellamkonda wherein the threshold value is set based on the memory allocated to track the number; and to include a variable allocated to track the number, as taught by Chhabra, for the benefit of reducing data processing overhead by reducing counter overflow (see e.g., Chhabra, [0078] and [0081]).
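For illustration only, the relationship asserted in the combination above (a variable's data type fixes the number of bytes allocated, which in turn bounds the count and informs the threshold) can be sketched as follows. This is an assumption-laden sketch, not the method of Gould or Chhabra: the helper names `max_count_for` and `threshold_for`, the use of an unsigned integer type, and the fixed `margin` are hypothetical.

```python
import ctypes


def max_count_for(ctype):
    """Largest value an unsigned counter of the given ctypes type can hold."""
    n_bytes = ctypes.sizeof(ctype)
    return (1 << (8 * n_bytes)) - 1


def threshold_for(ctype, margin):
    """Threshold placed `margin` counts below the counter's maximum."""
    limit = max_count_for(ctype)
    if not 0 <= margin <= limit:
        raise ValueError("margin must fit within the counter range")
    return limit - margin


print(ctypes.sizeof(ctypes.c_uint16))       # 2
print(max_count_for(ctypes.c_uint16))       # 65535
print(threshold_for(ctypes.c_uint16, 100))  # 65435
```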

As to claim 19, the limitations of parent claim 1 have been discussed above. Gould does not specifically disclose a memory allocated to track the number of data partitions. However, Skjolsvold teaches
a memory [partition table] allocated to track the number of data partitions (see e.g., [0030] for a partition table being used to track the current assignments of partitions to partition servers, when an active master or dictator assigns a partition to a server, the partition table being updated first to reflect the assignment, the partition table then being used to determine the partition server that will handle a client request based on the key specified in the client request, as an example, an entry in a partition table including the low key for a range, the high key for the range, and the role instance or server instance that will perform a requested task on the data or state corresponding to requested key, and a partition table also including other data, such as an epoch number or version number, [0039] for each partition having a current epoch number that is updated by the master when a change occurs for the partition, examples of changes for a partition including assignment of a partition to a new server, splitting of a partition, and merging of two partitions, assignments of a partition to a new server causing the epoch number to increase by one, splitting of a partition into two or more new partitions causing each child partition to receive the parent's epoch number incremented by one, and when two partitions are merged, the epoch number for the merged partition being the maximum epoch number for any of the partitions prior to merge incremented by one, [0042] for the information in partition table 455 being populated based on the partition decisions made by partition master 460, and [0063] for the thresholds for initiating a merge of partitions being reduced as the number of partitions approaches the upper limit. The partition table tracks the number of partitions by tracking the current assignments of partitions to partition servers. 
The partitions master obtains the number of partitions from the partition table in order to determine if the number is approaching an upper limit.).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to modify the data processing system of Gould to include a memory allocated to track the number of data partitions, as taught by Skjolsvold, for the benefit of reducing the overhead required to track and maintain the various partitions for a data set (see e.g., Skjolsvold, [0063]).
Gould and Skjolsvold in view of Bellamkonda does not specifically disclose wherein the threshold value is set based on the memory allocated to track the number; and wherein the threshold value is set to avoid an overflow error in the memory when the number satisfies the threshold value. However, Chhabra teaches
wherein the threshold value is set based on the memory allocated [counter] to track the number (see e.g., [0068] for it being important for the counter to be large enough to prevent an overflow, wherein the counter reaches its maximum value, in its foreseeable lifetime and [0081] for setting a threshold that indicates when the counter is getting close to entering the overflow state. A threshold is set based on the size of the counter.); and
wherein the threshold value is set to avoid an overflow error in the memory when the number satisfies the threshold value (see e.g., [0081] for setting a threshold that indicates when the counter is getting close to entering the overflow state, for example, if the counter value plus the threshold exceeds the maximum value of the counter, then remapping of the counter being necessary, if a counter is determined to be close to overflow (e.g., based on the threshold value) adaptive mapping promoting the data line to a larger counter, and promotion including, for example, searching counters with the next larger size to find the next-larger sized counter with the smallest value. The threshold is set to avoid an overflow error in the counter.).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to modify the data processing system of Gould and Skjolsvold in view of Bellamkonda wherein the threshold value is set based on the memory allocated to track the number; and wherein the threshold value is set to avoid an overflow error in the memory when the number satisfies the threshold value, as taught by Chhabra, for the benefit of reducing data processing overhead by reducing counter overflow (see e.g., Chhabra, [0078] and [0081]).
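For illustration only, the near-overflow test described in the cited passage (the counter value plus the threshold exceeding the counter's maximum value) can be sketched as follows. The available widths in `COUNTER_BITS` and the function names are hypothetical, and promotion is simplified to choosing the next wider size rather than searching for the next-larger counter with the smallest value.

```python
COUNTER_BITS = (8, 16, 32)  # hypothetical available counter widths


def near_overflow(value, bits, threshold):
    """True when value + threshold exceeds the maximum of a `bits`-wide counter."""
    return value + threshold > (1 << bits) - 1


def increment(value, bits, threshold):
    """Increment a counter, promoting it to the next wider size when close to overflow.

    Returns the new (value, bits) pair.
    """
    if near_overflow(value, bits, threshold):
        wider = [b for b in COUNTER_BITS if b > bits]
        if not wider:
            raise OverflowError("no wider counter available")
        bits = wider[0]
    return value + 1, bits


print(increment(10, 8, threshold=4))   # (11, 8)
print(increment(252, 8, threshold=4))  # (253, 16)
```

The promotion happens before the 8-bit maximum of 255 is reached, so the increment itself never overflows.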

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 

Bulkowski et al. (US Publication No. 2018/0004777) for “[a]n optimization that the DBMS can implement is to elect the partition version with the highest number of records as the acting master for this partition” (see [0076]).

Any inquiry concerning this communication or earlier communications from the examiner should be directed to DARA J GLASSER whose telephone number is (571)270-3666. The examiner can normally be reached Monday-Thursday, 10:00am-2:00pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Apu Mofiz, can be reached at (571)272-4080. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.




01-15-2026
/DARA J GLASSER/Examiner, Art Unit 2161

/APU M MOFIZ/Supervisory Patent Examiner, Art Unit 2161
Prosecution Timeline

Apr 03, 2024: Application Filed
Jan 15, 2026: Non-Final Rejection, §103, §DP (current)
Precedent Cases

Applications granted by this same examiner with similar technology (5 most recent grants):

17/344,645 (Patent 12572554): SYSTEMS, METHODS, AND COMPUTER READABLE MEDIA FOR DATA AUGMENTATION. 2y 5m to grant; granted Mar 10, 2026.
17/507,723 (Patent 12468669): TECHNIQUES FOR BUILDING AND VALIDATING DATABASE SOFTWARE IN A SHARED MANAGEMENT ENVIRONMENT. 2y 5m to grant; granted Nov 11, 2025.
17/476,403 (Patent 12443588): METHODS AND SYSTEMS FOR TRANSACTIONAL SCHEMA CHANGES. 2y 5m to grant; granted Oct 14, 2025.
17/358,972 (Patent 12298993): METADATA SYNCHRONIZATION FOR CROSS SYSTEM DATA CURATION. 2y 5m to grant; granted May 13, 2025.
17/340,219 (Patent 12271425): CONDENSING HIERARCHIES IN A GOVERNANCE SYSTEM BASED ON USAGE. 2y 5m to grant; granted Apr 08, 2025.
Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: 58%
With Interview (+53.9%): 99%
Median Time to Grant: 3y 7m
PTA Risk: Low
Based on 163 resolved cases by this examiner. Grant probability derived from career allow rate.