Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
DETAILED ACTION
Claims 1-20 are pending.
NOTE:
It is noted that any citations to specific pages, columns, lines, or figures in the prior art reference and any interpretations of the reference should not be considered to be limiting in any way. A reference is relevant for all it contains and may be relied upon for all that it would have reasonably suggested to one having ordinary skill in the art. See MPEP 2123.
Information Disclosure Statement
The references cited in the information disclosure statements (IDS) submitted on 2/11/25 and 9/5/25 have been considered by the examiner.
Specification
The title of the invention is not descriptive. A new title is required that is clearly indicative of the invention to which the claims are directed. The title is meant to have “informative value in indexing, classifying, searching”. See MPEP 606.01.
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale or otherwise available to the public before the effective filing date of the claimed invention.
(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.
Claims 1-20 are rejected under 35 U.S.C. 102(a)(1) and (a)(2) as being anticipated by Kung et al., U.S. Patent Pub. No. 2017/0315848 (hereinafter Kung).
Regarding claim 1, Kung teaches a method for processing data, wherein the method is applied to a data processing system, wherein the data processing system comprises a control node and a plurality of computing nodes (Fig. 4, 5; Para. 43-44 "When a client 405 submits a job and requests to run the job on a cluster, the resource manager 401 launches the application master 403 on a first node 404 and hands this job to this application master 403"), and the method comprises: estimating, by the control node, a data volume of result data generated after a data processing task is executed (Fig. 3, 4; Para. 41-42 "An input for each Reducer R_i 311 can be estimated using one or more of: the total number of Mappers (T_mapper) 305, the Mapper output of data size M_o 307, a total number of Reducers (T_reducer) 315, or a total reducer input data size (T_ri) 313");
obtaining, by the control node, memory information of a first computing node that is in the plurality of computing nodes and that executes a reduce task (Fig. 3; Para. 41-42 "R_comput 339 identifies a portion of the heap memory of the Reducer 222 that is used for computation, R_sib 341 identifies a portion of the heap memory of the Reducer 222 that is used as a shuffle input buffer, R_mbuffer 343 identifies a portion of R_sib 341 used to buffer Mapper 212 output, R_smt 345 is a threshold for initiating the merger of the Mapper 212 output, R_mib 347 is the Reducer-merged input buffer, and R_o 349 is output data size for the Reducer 222");
determining, by the control node, a quantity of reduce tasks based on the data volume and the memory information (Fig. 2, 3; Para. 62-63 "Next the total number of Reducers 222, 224, and 226, the total number of containers in one node, and the default cost for the Reducers all need to be derived. A total number of Reducers is given..");
executing, by a plurality of second computing nodes that are in the plurality of computing nodes and that execute the data processing task, the data processing task in parallel, wherein each of the plurality of second computing nodes partitions, based on the quantity of the reduce tasks, the result data generated after the data processing task is executed, and wherein each partition corresponds to one reduce task (Fig. 2; Para. 5-8 "If the involved data sets are large, they are automatically partitioned across multiple nodes and the operations are applied in parallel", Para. 37-38 "Mapper-side shuffle: Incoming data 201 is divided into a plurality of splits, such as Split 0 202, Split 1 204, and Split 2 206." "A partition is performed before writing the data into disk to provide a first disk partition 231, a second disk partition 232, and a third disk partition 233. The total number of disk partitions is equal to the number of Reducers 222, 224, and 226."); and
performing, by the first computing node, reduce processing on data obtained after partitioning is performed by the plurality of second computing nodes (Fig. 2-4; Para. 45-46 "An input for each of the Reducers 222, 224, and 226 can be estimated using the total number of Mappers 212, 214, and 216, mapper output, and the number of Reducers 222, 224, and 226").
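For illustration only, and not as a characterization of Kung or of the claimed method, the partition-then-reduce arrangement recited above (one partition per reduce task) can be sketched as follows. The function names, the CRC32 hash partitioner, and the summing reducer are all assumptions chosen for the sketch; neither the claim nor the cited paragraphs specify them.

```python
import zlib
from collections import defaultdict


def partition_records(records, num_reduce_tasks):
    """Split one node's (key, value) output into num_reduce_tasks partitions.

    Assumption: a CRC32 hash partitioner, so the same key always maps to the
    same partition index on every node.
    """
    partitions = [[] for _ in range(num_reduce_tasks)]
    for key, value in records:
        index = zlib.crc32(key.encode("utf-8")) % num_reduce_tasks
        partitions[index].append((key, value))
    return partitions


def reduce_partition(chunks):
    """Hypothetical reduce step: sum the values for each key."""
    totals = defaultdict(int)
    for chunk in chunks:
        for key, value in chunk:
            totals[key] += value
    return dict(totals)


def run_job(node_outputs, num_reduce_tasks):
    """Partition every node's output, then run one reduce per partition index."""
    per_node = [partition_records(out, num_reduce_tasks) for out in node_outputs]
    results = []
    for task in range(num_reduce_tasks):
        # Reduce task `task` gathers partition `task` from every node.
        results.append(reduce_partition(parts[task] for parts in per_node))
    return results
```

In this sketch the number of partitions produced by each node equals the number of reduce tasks, so every key is handled by exactly one reduce task.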
Regarding claim 2, Kung teaches all the limitations of the base claims as outlined above.
Further, Kung teaches wherein the estimating, by the control node, a data volume of result data generated after a data processing task is executed comprises: obtaining historical data generated when a previously completed data processing task is executed, wherein the historical data comprises a data volume of result data generated by the previously completed data processing task; and estimating, based on the historical data, the data volume of the result data generated after the data processing task is executed (Fig. 2; Para. 13-15 "The CM is configured for building and managing a catalog for historical jobs, data, and system resources. Statistics in the catalog are collected by a job profiler, a data profiler and a system profiler").
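For illustration only, one way a history-based estimate of this kind could be computed is sketched below. The use of an average output-to-input ratio, the recording of input volumes alongside result volumes, and the name `estimate_from_history` are assumptions made for the sketch; they are not taken from the claim or from Kung.

```python
from statistics import mean


def estimate_from_history(history, current_input_bytes):
    """Estimate result-data volume from previously completed tasks.

    history: list of (input_bytes, result_bytes) pairs from earlier runs.
    Assumption: the result volume scales linearly with the input volume.
    """
    ratios = [result / inp for inp, result in history if inp > 0]
    if not ratios:
        raise ValueError("no usable historical records")
    return mean(ratios) * current_input_bytes
```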
Regarding claim 3, Kung teaches all the limitations of the base claims as outlined above.
Further, Kung teaches wherein the estimating, by the control node, a data volume of result data generated after a data processing task is executed comprises: sampling, within a period of time after the plurality of second computing nodes start to execute the data processing task in parallel, result data generated by the plurality of second computing nodes by executing the data processing task; and estimating, based on sampled result data, the data volume of the result data generated after the data processing task is executed (Fig. 2-4; Para. 37-38, 110-111).
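For illustration only, a sampling-based estimate of this kind could be a simple extrapolation from the result data observed during the initial period. The linear extrapolation and the name `estimate_total_output` are assumptions made for the sketch and are not drawn from the claim or from Kung.

```python
def estimate_total_output(sampled_output_bytes, input_bytes_processed, total_input_bytes):
    """Extrapolate total result volume from an early sample of result data.

    Assumption: the output/input ratio seen during the sampling period holds
    for the remainder of the task.
    """
    if input_bytes_processed <= 0 or total_input_bytes <= 0:
        raise ValueError("input sizes must be positive")
    ratio = sampled_output_bytes / input_bytes_processed
    return ratio * total_input_bytes
```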
Regarding claim 4, Kung teaches all the limitations of the base claims as outlined above.
Further, Kung teaches wherein the estimating, by the control node, a data volume of result data generated after a data processing task is executed comprises: before the plurality of second computing nodes execute the data processing task, sampling to-be-processed data in the plurality of second computing nodes, and indicating the plurality of second computing nodes to process sampled to-be-processed data; and estimating, based on a processing result of the to-be-processed data, the data volume of the result data generated after the data processing task is executed (Fig. 3, 4; Para. 41-42 "An input for each Reducer R_i 311 can be estimated using one or more of: the total number of Mappers (T_mapper) 305, the Mapper output of data size M_o 307, a total number of Reducers (T_reducer) 315, or a total reducer input data size (T_ri) 313").
Regarding claim 5, Kung teaches all the limitations of the base claims as outlined above.
Further, Kung teaches wherein the memory information is a memory size, and the determining, by the control node, a quantity of reduce tasks based on the data volume and the memory information comprises: dividing the data volume by the memory size; and rounding up to obtain the quantity of reduce tasks (Fig. 2; Para. 5-8 "If the involved data sets are large, they are automatically partitioned across multiple nodes and the operations are applied in parallel", Para. 37-38).
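For illustration only, the arithmetic recited in claim 5 (divide the data volume by the memory size, then round up) can be sketched as below. The function name and the use of byte counts as units are assumptions made for the sketch.

```python
def reduce_task_count(data_volume_bytes, memory_size_bytes):
    """Quantity of reduce tasks = ceil(data volume / memory size)."""
    if memory_size_bytes <= 0:
        raise ValueError("memory size must be positive")
    if data_volume_bytes < 0:
        raise ValueError("data volume must be non-negative")
    # Integer ceiling division avoids floating-point rounding on large sizes.
    return -(-data_volume_bytes // memory_size_bytes)
```

For example, an estimated 10 GiB of result data and 4 GiB of memory per reduce task give 10 / 4 = 2.5, which rounds up to 3 reduce tasks.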
Regarding claim 6, Kung teaches all the limitations of the base claims as outlined above.
Further, Kung teaches wherein a quantity of first computing nodes is equal to the quantity of reduce tasks, and one of the first computing nodes executes a respective one of the reduce tasks (Fig. 2; Para. 5-8, Para. 37-38 "Mapper-side shuffle: Incoming data 201 is divided into a plurality of splits, such as Split 0 202, Split 1 204, and Split 2 206." "A partition is performed before writing the data into disk to provide a first disk partition 231, a second disk partition 232, and a third disk partition 233. The total number of disk partitions is equal to the number of Reducers 222, 224, and 226.").
Regarding claim 7, Kung teaches all the limitations of the base claims as outlined above.
Further, Kung teaches wherein a quantity of first computing nodes is less than the quantity of reduce tasks, and one of the first computing nodes executes a plurality of reduce tasks (Fig. 2-4; Para. 37-38, 41-43).
Regarding claims 8-20, Kung teaches these claims according to the reasoning set forth in claims 1-7.
Conclusion
The prior art made of record, listed on form PTO-892, and not relied upon, if any, is considered pertinent to applicant's disclosure.
Huang et al., US20190227853, teaches reduce tasks based on data volume.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to TASNIMA MATIN whose telephone number is (571) 272-8785. The examiner can normally be reached Monday-Friday 9-5.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Jared Rutz, can be reached at 571-272-5535. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
TASNIMA MATIN
Primary Examiner
Art Unit 2135
/TASNIMA MATIN/Primary Examiner, Art Unit 2135