Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claim 21 is rejected under 35 U.S.C. 101 because the Applicant’s disclosure does not explicitly define the computer-readable medium. Examiner therefore broadly construes the medium to include both tangible and intangible embodiments. As such, the claim is not limited to statutory subject matter and is non-statutory.
NOTE: A claim drawn to such a computer-readable medium that covers both transitory and non-transitory embodiments may be amended to narrow the claim to cover only statutory embodiments, and thereby avoid a rejection under 35 U.S.C. § 101, by adding the limitation “non-transitory” to the claim.
Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale or otherwise available to the public before the effective filing date of the claimed invention.
Claims 1–4, 8–13, and 17–22 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Powers et al. (US Pat. No. 10,942,716), hereinafter referred to as Powers.
As to claim 1, Powers discloses a heterogeneous acceleration device (heterogeneous group of processing units, 14A–14N, Fig. 1) comprising a first field programmable gate array (FPGA) (processing unit selectable including FPGA, 14N, Fig. 1, col. 3, line 52) and at least one second FPGA (additional specialized processing units, 14A–14N, Fig. 1), wherein the first FPGA is connected to an upper computer through a peripheral component interface express (PCIe) bus (runtime computing system communicating with host application executable, runtime computing system 12, Fig. 1) and is configured to receive first data transmitted by the upper computer (receiving application executable / platform-independent instructions, 28 / 22, Fig. 3) and return second data to the upper computer (executed computational task results returned through runtime system, processing units executing instructions, 14A–14N, Fig. 3);
the first data is data that needs to be accelerated by a heterogeneous acceleration device (converting the platform-independent instructions into platform-dependent instructions (e.g., platform-dependent instructions 39A and/or 39N) (86), Fig. 5) and the second data is data obtained after acceleration by the heterogeneous acceleration device (execution of platform-dependent instructions to perform computational task, 39A–39N, Fig. 3; Fig. 5);
the first FPGA is connected to the at least one second FPGA through a high-speed transmission device (communication channels between processing units, communication channels 50, Fig. 4) and is configured to transmit corresponding first data units to one or more second FPGAs among the at least one second FPGA (scheduler distributing instructions among processing units, scheduler 36 distributing platform-independent instructions 22, Fig. 3) and receive second data units returned by the one or more second FPGAs among the at least one second FPGA (processing units execute and return results to runtime system, processing units 14A–14N executing instructions, Fig. 3);
any one of the at least one second FPGA has at least one acceleration application (specialized hardware executing computational kernels, hardware backend modules 38A–38N executing kernels, Fig. 3);
the first data units are obtained by splitting the first data (determining one or more scheduling criteria that are associated with the platform-independent instructions 82, Fig. 5); a data type processed by each acceleration application corresponds to a data type of each first data unit (scheduler selecting processing unit based on scheduling criteria and hardware capability, scheduler 36 selecting processing unit, Fig. 3);
the second data units are data units obtained after corresponding acceleration applications perform heterogeneous acceleration on the first data units (selected processing unit executes platform-dependent instructions to perform computational task, 39A–39N, Fig. 5); and
the second data is obtained after merging the second data units (runtime system collecting execution outputs of processing units, runtime computing system 12 managing execution across processing units, Fig. 3).
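To illustrate the split/accelerate/merge flow mapped above, the following minimal software sketch (hypothetical names throughout; it is not taken from the claims or from Powers) models first data being split into first data units by data type, each unit being processed by the acceleration application matching that data type, and the resulting second data units being merged into second data:

```python
# Illustrative sketch only: hypothetical names, not drawn from the claims or from Powers.

from dataclasses import dataclass
from typing import Callable

@dataclass
class DataUnit:
    data_type: str   # used to select the matching acceleration application
    payload: bytes

def split_first_data(first_data: list[DataUnit]) -> dict[str, list[DataUnit]]:
    """Split first data into first data units grouped by data type (first FPGA role)."""
    groups: dict[str, list[DataUnit]] = {}
    for unit in first_data:
        groups.setdefault(unit.data_type, []).append(unit)
    return groups

def accelerate_all(first_data: list[DataUnit],
                   accel_apps: dict[str, Callable[[bytes], bytes]]) -> bytes:
    """Dispatch each group to its acceleration application and merge the second data."""
    second_units: list[bytes] = []
    for data_type, units in split_first_data(first_data).items():
        app = accel_apps[data_type]                # acceleration application on a second FPGA
        second_units.extend(app(u.payload) for u in units)
    return b"".join(second_units)                  # merged second data for the upper computer
```

In the claimed device, the dispatch step would occur over the high-speed links between the first FPGA and the second FPGAs rather than as a local function call.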
As to claim 2, Powers discloses the heterogeneous acceleration device according to claim 1, characterized in that in response to the heterogeneous acceleration device enabling a virtualization acceleration function (the runtime computing system enables the virtualization acceleration function by presenting multiple heterogeneous processing units as one acceleration device to the upper computer; runtime computing system abstracting heterogeneous processing hardware from application executable, runtime computing system 12, Fig. 1; Fig. 3), the first FPGA in the heterogeneous acceleration device is configured to comprise: a PCIe hard core (the interface corresponds to the PCIe hard core that receives the first data from the upper computer and returns second data; host interface hardware receiving application executable, runtime computing system 12, Fig. 1), a virtual device read-write management module (the runtime system functions as the virtual device read write management module that obtains first data, splits first data units, merges second data units, and returns second data; runtime system managing execution and I/O of processing tasks, runtime computing system 12, Fig. 3), and at least one first communication module (the communication channels correspond to first communication modules that transmit first data units to second FPGAs and receive second data units; inter processor communication channels, communication channels 50, Fig. 4);
the second FPGA in the heterogeneous acceleration device is configured to comprise: a second communication module (communication channels correspond to second communication modules connected to first communication modules; communication channels between processing units, communication channels 50, Fig. 4), a second direct memory access control module (movement of data between modules corresponds to DMA obtaining first data units and providing them for processing; data movement between processing units and memory, processing units 14A–14N, Fig. 3), and at least one virtual acceleration application (hardware backend modules correspond to virtual acceleration applications that perform heterogeneous acceleration; hardware backend modules executing computational kernels, 38A–38N, Fig. 3); the virtual device read-write management module is configured to: obtain the first data from the PCIe hard core (obtaining first data from the upper computer through the interface; receiving application executable instructions, 28 / 22, Fig. 3), split the first data into corresponding first data units according to communication identifiers in the first data (dividing the first data into first data units based on identifiers used for routing; determining scheduling criteria for platform independent instructions, 82, Fig. 5), transmit the first data units to corresponding first communication modules (sending first data units to communication modules for routing; scheduler distributing instructions among processing units, scheduler 36, Fig. 3), obtain corresponding second data units from the first communication modules (receiving second data units returned after acceleration; processing units returning execution results, 14A–14N, Fig. 3), merge the second data units into corresponding second data (combining returned second data units into second data; runtime system collecting outputs across processing units, runtime computing system 12, Fig. 3), and transmit the corresponding second data to the upper computer (returning the second data to the upper computer; execution output returned to application executable, Fig. 3); each of the at least one first communication module transmits data to at least one second communication module (communication modules exchange routed data between FPGAs; inter processing communication channels, communication channels 50, Fig. 4); the first communication module transmits the first data units to corresponding second communication modules according to the communication identifier (routing first data units to selected second FPGAs based on identifiers; scheduler selecting processing unit based on criteria, scheduler 36, Fig. 3); and the second direct memory access control module obtains the first data units from the second communication modules, transmits the first data units to corresponding virtual acceleration applications according to application identifications in the first data units, receives the second data units returned by the virtual acceleration applications, and transmits the second data units to the corresponding second communication modules (second FPGA processes the first data units and returns second data units through the communication modules; selected processing unit executing platform dependent instructions and producing results, 39A–39N, Fig. 5).
As to claim 3, Powers discloses the heterogeneous acceleration device according to claim 2, characterized in that the virtual device read-write management module comprises: a read and split submodule (runtime scheduler distributing platform independent instructions, scheduler 36, Fig. 3) and a merge and write back submodule (runtime system aggregating execution results, runtime computing system 12, Fig. 3);
the read and split submodule splits the first data into the corresponding first data units according to the communication identifiers in the first data (scheduler 36, Fig. 3), and transmits the first data units to the corresponding first communication modules (determining scheduling criteria and distributing workload, 82, Fig. 5); and the merge and write back submodule merges the second data units into the corresponding second data and transmits the corresponding second data to the upper computer (collecting outputs and returning to application executable, runtime computing system 12, Fig. 3).
As to claim 4, Powers discloses the heterogeneous acceleration device according to claim 2, characterized in that the virtual device read-write management module further comprises a mapping table (association between task and selected processing unit, scheduler 36 scheduling criteria, Fig. 3); the mapping table comprises a correspondence relationship between the communication identifiers and the at least one first communication module (processing unit selection based on hardware capability criteria, scheduler 36, Fig. 3); and the first data units are transmitted to the corresponding first communication modules according to the mapping table (instruction routing to selected processing unit, scheduler 36, Fig. 3).
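As an illustration of the mapping-table routing described above, the following minimal sketch (hypothetical identifiers and module indices, not taken from the claims or from Powers) models a lookup from communication identifier to the first communication module that carries data units bearing that identifier:

```python
# Illustrative sketch only: hypothetical names, not drawn from the claims or from Powers.

from typing import Callable

# Mapping table: communication identifier -> index of a first communication module.
COMM_ID_TO_MODULE: dict[int, int] = {0x01: 0, 0x02: 1, 0x03: 1}

def route_first_data_units(units: list[dict],
                           comm_modules: list[Callable[[dict], None]]) -> None:
    """Send each first data unit to the first communication module selected by the table."""
    for unit in units:
        module_index = COMM_ID_TO_MODULE[unit["comm_id"]]
        comm_modules[module_index](unit)
```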
As to claim 8, Powers discloses the heterogeneous acceleration device according to claim 7, characterized in that in response to the second FPGA comprising one physical acceleration application (direct execution path between communication channel and processing unit, 38A–38N, Fig. 3), the fourth communication module is in communicative connection with the physical acceleration application (direct execution path between communication channel and processing unit, 38A–38N, Fig. 3); and the fourth communication module transmits the first data units to the physical acceleration application and receives the second data units from the physical acceleration application (processing units receive instructions and return results, 14A–14N, Fig. 3).
As to claim 9, Powers discloses the heterogeneous acceleration device according to claim 7, characterized in that in response to the second FPGA comprising a plurality of data acceleration applications, the second FPGA further comprises a split and merge management module (scheduler allocating workload among multiple processing units, scheduler 36, Fig. 3); the split and merge management module obtains the first data units from the fourth communication module, allocates the first data units to corresponding physical acceleration applications according to application identifications of the first data units (task routing and aggregation, scheduler 36, Fig. 3), obtains the second data units from the physical acceleration applications, and transmits the second data units to the fourth communication module (runtime computing system 12, Fig. 3).
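For illustration of the allocation by application identification described above, the following minimal sketch (hypothetical application names, not taken from the claims or from Powers) models a split and merge management module handing each first data unit to the physical acceleration application named by its application identification and collecting the resulting second data units:

```python
# Illustrative sketch only: hypothetical names, not drawn from the claims or from Powers.

from typing import Callable

def split_and_merge(first_units: list[dict],
                    accel_apps: dict[str, Callable[[bytes], bytes]]) -> list[dict]:
    """Allocate first data units by application identification; collect second data units."""
    second_units: list[dict] = []
    for unit in first_units:
        app = accel_apps[unit["app_id"]]           # select by application identification
        result = app(unit["payload"])              # heterogeneous acceleration
        second_units.append({"app_id": unit["app_id"], "payload": result})
    return second_units                             # returned via the fourth communication module

# Example: two hypothetical physical acceleration applications on the same second FPGA.
apps = {"compress": lambda b: b[:16], "checksum": lambda b: bytes([sum(b) & 0xFF])}
out = split_and_merge([{"app_id": "checksum", "payload": b"\x01\x02\x03"}], apps)
```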
As to claim 10, Powers discloses the heterogeneous acceleration device according to claim 7, characterized in that the storage read-write management module comprises: a read-out and split submodule (scheduler distributing instructions, scheduler 36, Fig. 3) and a merge and write-in submodule (runtime system aggregating outputs, runtime computing system 12, Fig. 3); the read-out and split submodule obtains the first data from the storage controller, splits the first data into the corresponding first data units according to the communication identifiers in the first data (determining scheduling criteria and distributing workload, 82, Fig. 5), and transmits the first data units to the corresponding third communication modules (scheduler 36, Fig. 3); and the merge and write-in submodule obtains the second data units from the third communication modules, merges the second data units into the corresponding second data, and transmits the corresponding second data to the storage controller (collecting outputs and returning data, runtime computing system 12, Fig. 3).
As to claim 11, Powers discloses the heterogeneous acceleration device according to claim 7, characterized in that the first FPGA further comprises a data acceleration application (processing unit executing computational kernel, 38A–38N, Fig. 3); and the data acceleration application obtains the first data units from the read-out and split submodule, performs heterogeneous acceleration on the first data units, and transmits the corresponding second data units to the merge and write-in submodule (execution and return of results, 39A–39N, Fig. 5).
As to claim 12, Powers discloses a heterogeneous acceleration system, characterized by comprising an upper computer (application executable generating computational tasks, 28, Fig. 3) and the heterogeneous acceleration device according to any one of claims 1 to 11 (runtime computing system 12 with processing units 14A–14N, Fig. 1), wherein the upper computer comprises: an application driver module and an application interface module (application executable interacting with runtime computing system, Fig. 3); the application driver module is configured to: configure registers of a virtual acceleration application, a physical acceleration application (direct execution path between communication channel and processing unit, 38A–38N, Fig. 3), and a data acceleration application, and control the virtual acceleration application, the physical acceleration application, and the data acceleration application through the application interface module (application executable controlling execution on processing units, Fig. 3).
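To illustrate the register-configuration role ascribed to the application driver module, the following minimal sketch (hypothetical register names and offsets, not taken from the claims or from Powers) models a driver programming an acceleration application’s control registers through an application interface; a real driver would perform such writes over PCIe, for example through a memory-mapped BAR:

```python
# Illustrative sketch only: hypothetical register map, not drawn from the claims or from Powers.

REG_SRC_ADDR = 0x00   # source buffer address
REG_DST_ADDR = 0x08   # destination buffer address
REG_LENGTH   = 0x10   # transfer length in bytes
REG_CONTROL  = 0x18   # bit 0 = start

class AccelAppInterface:
    """Stand-in for the application interface module exposed to the driver."""
    def __init__(self) -> None:
        self.regs: dict[int, int] = {}

    def write_reg(self, offset: int, value: int) -> None:
        self.regs[offset] = value

def configure_and_start(iface: AccelAppInterface, src: int, dst: int, length: int) -> None:
    """Application driver module: program the acceleration application, then start it."""
    iface.write_reg(REG_SRC_ADDR, src)
    iface.write_reg(REG_DST_ADDR, dst)
    iface.write_reg(REG_LENGTH, length)
    iface.write_reg(REG_CONTROL, 0x1)   # set the start bit

configure_and_start(AccelAppInterface(), src=0x1000_0000, dst=0x2000_0000, length=4096)
```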
As to claim 13, Powers discloses a heterogeneous acceleration method, characterized by being applied to the first FPGA in the heterogeneous acceleration system according to claim 12 and comprising: obtaining first data from the upper computer (receiving application executable instructions, 28 / 22, Fig. 3), wherein the first data is data that needs to be accelerated by the heterogeneous acceleration device (platform independent instructions representing computational tasks, Fig. 5); splitting the first data into first data units according to communication identifiers in the first data (scheduler 36, Fig. 3), and transmitting the first data units to corresponding second FPGAs (determining scheduling criteria and distributing workload, 82, Fig. 5); obtaining second data units from one or more second FPGAs among the at least one second FPGA, wherein the second data units are results obtained by processing the first data units through corresponding data acceleration applications (processing unit execution results, 14A–14N, Fig. 3); and merging the second data units into second data, and returning the second data to the upper computer (runtime computing system aggregating outputs and returning results, Fig. 3).
As to claim 17, Powers discloses a heterogeneous acceleration method, characterized by being applied to the second FPGA in the heterogeneous acceleration system according to claim 12 and comprising: obtaining a first data unit transmitted by the first FPGA (processing unit receiving instructions, 14A–14N, Fig. 3), wherein the first data unit comprises an application identification; allocating the first data unit to a corresponding virtual acceleration application for acceleration processing according to the application identification, and obtaining a corresponding second data unit, wherein the second data unit is a result obtained by processing the first data unit through a corresponding data acceleration application (execution of platform-dependent instructions, 39A–39N, Fig. 5); and transmitting the second data unit to the first FPGA through a corresponding second communication module (communication channels between processing units, communication channels 50, Fig. 4).
As to claim 19, Powers discloses the heterogeneous acceleration method according to claim 18, characterized in that before the activating a single root input output virtualization function of a driver program, the method further comprises: in response to a heterogeneous acceleration device enabling a virtualization acceleration function (runtime computing system abstracting heterogeneous processing hardware from application executable, runtime computing system 12, Fig. 1; Fig. 3), configuring the first FPGA in the heterogeneous acceleration device to comprise: a PCIe hard core, a virtual device read write management module, and at least one first communication module (runtime computing system interfacing and managing execution, runtime computing system 12, Fig. 1; Fig. 3); and configuring the second FPGA in the heterogeneous acceleration device to comprise: a second communication module, a second direct memory access control module, and at least one virtual acceleration application (processing units executing computational kernels, 14A–14N; 38A–38N, Fig. 3).
As to claim 20, Powers discloses a heterogeneous acceleration apparatus, characterized by comprising: an acceleration data obtaining module, configured to obtain first data from an upper computer (runtime computing system receiving application executable instructions, 12, Fig. 3), wherein the first data is data that needs to be accelerated by a heterogeneous acceleration device; a data splitting and transmission module, configured to: split the first data into first data units according to communication identifiers in the first data, and transmit the first data units to corresponding second FPGAs (determining scheduling criteria and distributing workload, 82, Fig. 5; scheduler 36, Fig. 3); a data acquisition module, configured to obtain second data units from one or more second FPGAs among at least one second FPGA, wherein the second data units are results obtained by processing the first data units through corresponding data acceleration applications (processing unit execution outputs, 14A–14N, Fig. 3); and a data merging module, configured to: merge the second data units into second data and return the second data to the upper computer (runtime computing system aggregating outputs, Fig. 3).
As to claim 21, Powers discloses one or more non-volatile computer-readable storage media, having a computer-readable instruction stored therein, characterized in that the computer-readable instruction, when executed by one or more processors, causes the one or more processors to perform the heterogeneous acceleration method according to any one of claims 13 to 19 (application executable instructions executed by processing units, 28 / 22, Fig. 3).
As to claim 22, Powers discloses a server, characterized in that when the server executes a computer-readable instruction, the server implements the heterogeneous acceleration method according to any one of claims 13 to 19 (computing platform executing runtime computing system with heterogeneous processing units, runtime computing system 12 with processing units 14A–14N, Fig. 1).
Allowable Subject Matter
Claims 5 – 7 and 14 – 16 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Karr et al. (US Pub. No. 20220334990) discloses a method of applying a data format in a direct memory access transfer. The method includes distributing user data throughout a plurality of storage nodes through erasure coding, wherein the plurality of storage nodes are housed within a single chassis that couples the storage nodes as a cluster.
Contact Information
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JUANITO C BORROMEO whose telephone number is (571)270-1720. The examiner can normally be reached on Monday - Friday 9 - 5.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Henry Tsai, can be reached on 571-272-4176. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/J.C.B/ Assistant Examiner, Art Unit 2184
/HENRY TSAI/Supervisory Patent Examiner, Art Unit 2184