DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Other reference: Lee, US Patent Application Publication Number 20220147470 – Accessing memory based on multi-protocol (Abstract).
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 12/8/2025 has been entered.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1, 7, 14, and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Clark et al., US Patent Application Publication Number 20220114086 (herein “CLARK”), in view of Willhalm et al., US Patent Application Publication Number 20230022544 (herein “WILLHALM”), and further in view of Connor, US Patent Application Publication Number 20230176987 (herein “Connor”).
Regarding claim 1, CLARK discloses a computer-implemented method for performing data transfer operations in a multiprocessor system (FIGs. 1 and 2, system 100 with Host compute device 105, host CPU 107, and device 130 which may include “compute circuitry 136 may be a GPU” [0023]), the method comprising:
accessing, by a network controller (FIGs. 1 and 2, root complex 120 may be a network controller), a first memory of a first processor via a first interconnect using a first address map ([0020]-[0022], “root complex 120 may be arranged to function as a type of peripheral component interface express (PCIe) root complex for CPU 107 and/or other elements of host computing device 105 to communicate with devices such as device 130 via use of PCIe-based communication protocols and communication links…. memory transaction logic 133 and IO transaction logic 135 may be included in logic and/or features of device 130 that serve a role in exposing or reclaiming portions of device memory 134 based on what amount of memory capacity is or is not needed by compute circuitry 136 or device 130. The exposed portions of device memory 134, for example, available for use in a pooled or shared system memory that is shared with host compute device 105's host system memory 110 and/or other with other device memory of other device(s) coupled with host compute device 105”. E.g. root complex 120 as network controller may access the exposed portions, as first memory, of device memory 134 using PCIe-based memory address map via first interconnect between root complex 120 and device 130);
accessing, by a second processor (FIGs. 1 and 2, host CPU 107), the first memory of the first processor via a second interconnect using a second address map ([0020]-[0022], “root complex 120 may be arranged to function as a type of peripheral component interface express (PCIe) root complex for CPU 107 and/or other elements of host computing device 105 to communicate with devices such as device 130 via use of PCIe-based communication protocols and communication links…. As shown in FIG. 1 and described more below, root complex 120 includes host-managed device memory (HDM) decoders 126 that may be programmed to facilitate a mapping of host to device physical addresses for use in system memory (e.g., pooled system memory)…. As described more below, memory transaction logic 133 and IO transaction logic 135 may be included in logic and/or features of device 130 that serve a role in exposing or reclaiming portions of device memory 134 based on what amount of memory capacity is or is not needed by compute circuitry 136 or device 130. The exposed portions of device memory 134, for example, available for use in a pooled or shared system memory that is shared with host compute device 105's host system memory 110 and/or other with other device memory of other device(s) coupled with host compute device 105”. E.g. Host compute device 105 may access the exposed portions, as first memory, of device memory 134 using HDM decoded memory address map. Second interconnect being the connection between host CPU 107 and root complex 120); and
accessing, by the second processor, a second memory of the first processor via a third interconnect, wherein the first interconnect is coupled to the third interconnect ([0020]-[0022], “root complex 120 may be arranged to function as a type of peripheral component interface express (PCIe) root complex for CPU 107 and/or other elements of host computing device 105 to communicate with devices such as device 130 via use of PCIe-based communication protocols and communication links…. As shown in FIG. 1 and described more below, root complex 120 includes host-managed device memory (HDM) decoders 126 that may be programmed to facilitate a mapping of host to device physical addresses for use in system memory (e.g., pooled system memory)…. As described more below, memory transaction logic 133 and IO transaction logic 135 may be included in logic and/or features of device 130 that serve a role in exposing or reclaiming portions of device memory 134 based on what amount of memory capacity is or is not needed by compute circuitry 136 or device 130. The exposed portions of device memory 134, for example, available for use in a pooled or shared system memory that is shared with host compute device 105's host system memory 110 and/or other with other device memory of other device(s) coupled with host compute device 105”. E.g. Host compute device 105 may access a different portion of the exposed portions, as second memory, of device memory 134 using HDM decoded memory address map. Third interconnect being the connection between host CPU 107 and root complex 120, and the third interconnect is connected to the first interconnect that’s between the root complex 120 and the device 130, through the root complex 120).
CLARK does not explicitly disclose wherein the second interconnect directly connects the second processor with the first processor, and wherein the third interconnect connects the second processor with the first processor via a switch.
WILLHALM discloses wherein the second interconnect directly connects the second processor with the first processor (FIG. 11, [0104] “As shown in FIG. 11, multiprocessor system 1100 is a point-to-point interconnect system, and includes a first processor 1170 and a second processor 1180 coupled via a point-to-point interconnect 1150.” Thus, the point-to-point interconnect 1150, as a second interconnect, directly connects the first processor 1170 to the second processor 1180), and
wherein the third interconnect connects the second processor with the first processor via a switch (FIG. 11, [0107] “Processors 1170, 1180 each exchange information with a chipset 1190 via individual P-P interfaces 1152, 1154.” Thus, the second processor 1180 is connected to the first processor 1170 via a third interconnect (one of 1152 or 1154) through a switch, chipset 1190).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify CLARK’s multi-processor system and method to further include WILLHALM’s point-to-point interconnect interface on PCIe bus, to take advantage of advances in point-to-point interconnects, switch-based technology, and packetized protocols to deliver new levels of performance and features (see WILLHALM [0016]).
CLARK in view of WILLHALM does not disclose, but Connor discloses, to support data transfers; and the switch to support one or more of configuration operations, control operations, register read operations, register write operations, or interrupt operations (e.g., FIGs. 3-3C; [0045], “FIG. 3 further shows how data corresponding to a packet 228 that is received by a NIC at a first node (A) but contains data that is to be written to a memory resource on a second node (B) is handled under NUMA platform architecture 300”; [0050], “PCIe root complexes, architecture 300a includes PCIe interconnects 303a1, 303a2, 303b1, and 303b2, which are coupled to a many-to-many PCIe switch 301a. To help facilitate packet routing within the switch, many-to-many PCIe switch 301a includes address maps 311a1, 311a2, 311b1, and 311b2”; [0039], “a NIC driver may provide access to registers on a NIC, provide a program interface to the NIC, etc. The NIC driver also facilitates handling and forwarding of data received via packets from the network to consumers of that data, such as a software application”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify CLARK’s peripheral component interface express (PCIe) root complex, through which the CPU and/or other elements of the host computing device communicate with devices, as combined with WILLHALM’s point-to-point interconnect among processors, to further include Connor’s many-to-many peripheral switch, providing the benefit of computer platforms and architectures that employ many-to-many and many-to-one peripheral switches internally within a computer system (see Connor [0002]), wherein, under a NUMA architecture, processors (and processor cores) are enabled to access different memory resources distributed across the platform (see Connor [0036]).
Regarding claim 7, CLARK discloses the computer-implemented method of claim 1, wherein: the first processor comprises a graphics processor (FIGs. 1 and 2, device 130 which may include “compute circuitry 136 may be a GPU” [0023]), and the second processor comprises a central processing unit (FIGs. 1 and 2, host CPU 107).
Regarding claim 14, the applicant is directed to the rejection of claim 1 set forth above, as claim 14 is rejected based on the same rationale.
Regarding claim 16, the applicant is directed to the rejection of claim 7 set forth above, as claim 16 is rejected based on the same rationale.
Claims 2-4 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over CLARK in view of WILLHALM, and Connor (cited above) and further in view of Pudipeddi et al., US Patent Application Publication Number 20090113139 (herein “PUDIPEDDI”).
Regarding claim 2, CLARK does not explicitly disclose the computer-implemented method of claim 1, wherein accessing, by the network controller, the first memory of the first processor via the first interconnect comprises: determining that at least a portion of the first memory is owned by the second processor; transmitting a snoop operation to the second processor; and receiving a response to the snoop operation from the second processor that includes data stored in the at least the portion of the first memory.
PUDIPEDDI discloses the computer-implemented method of claim 1, wherein accessing, by the network controller, the first memory of the first processor via the first interconnect comprises: determining that at least a portion of the first memory is owned by the second processor; transmitting a snoop operation to the second processor; and receiving a response to the snoop operation from the second processor that includes data stored in the at least the portion of the first memory ([0013], in a multi-node processor network, FIGs. 1 and 4 “four agents are present, namely agents A, B and C, which may correspond to processor nodes or other system agents. In addition, a home agent is present. The home agent may be a processor node or other system agent that is owner of a particular memory region of interest (i.e., the home agent may be coupled to a local portion of main memory including one or more lines of interest)…. As shown in FIG. 2, agent A desires to read data present in the memory associated with the home agent and accordingly sends a read data signal (RdData). At the same time, agent A sends snoop requests (SnpData) to the other system agents, namely agents B and C. As shown in FIG. 2, when the home agent receives the read data request, it will perform a prefetch of the data as well as lookup of a state of the requested line in its directory. If the directory state indicates that no agents are caching a copy of the line (i.e., the directory entry is in the I state) the home agent will immediately return the data as soon as it is ready to agent A with a DataC_E message (and change the directory state for the agent A to valid)”. E.g. agent A may be considered the network controller, and home agent may be considered the second processor, which is the owner of the data in the memory of agent A. The snoop is broadcasted by agent A to home agent, and home agent responds to the snoop operation with the data in the response).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify CLARK’s multi-processor system and method with WILLHALM’s point-to-point interconnect interface on PCIe bus, and Connor to further include PUDIPEDDI’s snoop request for accessing data, to reduce latencies in data accesses (see PUDIPEDDI [0001]-[0003]).
Regarding claim 3, CLARK does not explicitly disclose the computer-implemented method of claim 2, wherein accessing, by the network controller, the first memory of the first processor via the first interconnect further comprises: in response to receiving the response to the snoop operation, transmitting an acknowledgement to the network controller that the data stored in the at least the portion of the first memory is available.
PUDIPEDDI discloses the computer-implemented method of claim 2, wherein accessing, by the network controller, the first memory of the first processor via the first interconnect further comprises: in response to receiving the response to the snoop operation, transmitting an acknowledgement to the network controller that the data stored in the at least the portion of the first memory is available ([0013], in a multi-node processor network, FIGs. 1 and 4 “four agents are present, namely agents A, B and C, which may correspond to processor nodes or other system agents. In addition, a home agent is present. The home agent may be a processor node or other system agent that is owner of a particular memory region of interest (i.e., the home agent may be coupled to a local portion of main memory including one or more lines of interest)…. As shown in FIG. 2, agent A desires to read data present in the memory associated with the home agent and accordingly sends a read data signal (RdData). At the same time, agent A sends snoop requests (SnpData) to the other system agents, namely agents B and C. As shown in FIG. 2, when the home agent receives the read data request, it will perform a prefetch of the data as well as lookup of a state of the requested line in its directory. If the directory state indicates that no agents are caching a copy of the line (i.e., the directory entry is in the I state) the home agent will immediately return the data as soon as it is ready to agent A with a DataC_E message (and change the directory state for the agent A to valid)”. E.g. agent A may be considered the network controller, and home agent may be considered the second processor, which is the owner of the data in the memory of agent A. The snoop is broadcasted by agent A to home agent, and home agent responds to the snoop operation with the data in the response and acknowledging by changing the directory state for agent A to valid/available for agent A).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify CLARK’s multi-processor system and method with WILLHALM’s point-to-point interconnect interface on PCIe bus, and Connor to further include PUDIPEDDI’s snoop request for accessing data, to reduce latencies in data accesses (see PUDIPEDDI [0001]-[0003]).
Regarding claim 4, CLARK does not explicitly disclose the computer-implemented method of claim 3, wherein the second processor maintains ownership of the at least the portion of the first memory pending receiving a transaction complete acknowledgment associated with the first interconnect.
PUDIPEDDI discloses the computer-implemented method of claim 3, wherein the second processor maintains ownership of the at least the portion of the first memory pending receiving a transaction complete acknowledgment associated with the first interconnect ([0013], in a multi-node processor network, FIGs. 1 and 4 “four agents are present, namely agents A, B and C, which may correspond to processor nodes or other system agents. In addition, a home agent is present. The home agent may be a processor node or other system agent that is owner of a particular memory region of interest (i.e., the home agent may be coupled to a local portion of main memory including one or more lines of interest)…. As shown in FIG. 2, agent A desires to read data present in the memory associated with the home agent and accordingly sends a read data signal (RdData). At the same time, agent A sends snoop requests (SnpData) to the other system agents, namely agents B and C. As shown in FIG. 2, when the home agent receives the read data request, it will perform a prefetch of the data as well as lookup of a state of the requested line in its directory. If the directory state indicates that no agents are caching a copy of the line (i.e., the directory entry is in the I state) the home agent will immediately return the data as soon as it is ready to agent A with a DataC_E message (and change the directory state for the agent A to valid)”. E.g. agent A may be considered the network controller, and the home agent may be considered the second processor, which is the owner of the data in the memory of agent A. The snoop is broadcast by agent A to the home agent, and the home agent responds to the snoop operation with the data in the response, and maintains ownership (does not change ownership) of the data until the snoop response is received by agent A and the transaction is thus complete; completion acknowledgment/receipt of the data is inherent in the snoop response that sends the data to agent A).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify CLARK’s multi-processor system and method with WILLHALM’s point-to-point interconnect interface on PCIe bus, and Connor to further include PUDIPEDDI’s snoop request for accessing data, to reduce latencies in data accesses (see PUDIPEDDI [0001]-[0003]).
Regarding claim 15, the applicant is directed to the rejection of claim 2 set forth above, as claim 15 is rejected based on the same rationale.
Claims 5 and 6 are rejected under 35 U.S.C. 103 as being unpatentable over CLARK in view of WILLHALM, and Connor (cited above) and further in view of Foley, US Patent Application Publication Number 20230273818 (herein “FOLEY”).
Regarding claim 5, CLARK discloses the computer-implemented method of claim 1, wherein accessing, by the network controller, the first memory of the first processor via the first interconnect comprises: performing a first write operation via the first interconnect that is directed to a first portion of the first memory; and performing a second write operation via the first interconnect that is directed to a second portion of the first memory ([0020]-[0022], “root complex 120 may be arranged to function as a type of peripheral component interface express (PCIe) root complex for CPU 107 and/or other elements of host computing device 105 to communicate with devices such as device 130 via use of PCIe-based communication protocols and communication links…. As described more below, memory transaction logic 133 and IO transaction logic 135 may be included in logic and/or features of device 130 that serve a role in exposing or reclaiming portions of device memory 134 based on what amount of memory capacity is or is not needed by compute circuitry 136 or device 130. The exposed portions of device memory 134, for example, available for use in a pooled or shared system memory that is shared with host compute device 105's host system memory 110 and/or other with other device memory of other device(s) coupled with host compute device 105”. [0043], “application(s) 108 may access the DPA addresses mapped to programmed HDM decoders 125 for the portion of device memory 134 that was exposed for use in system memory. In some examples, applications(s) 108 may route read/write requests through memory transaction link 113 and logic and/or features of host adaptor circuitry 132 such as MTL 133 may forward the read/write requests to MC 131 to access the exposed memory capacity of device memory 134”. E.g. root complex 120 as network controller may access the exposed portions, as first memory, of device memory 134 using a PCIe-based memory address map via the first interconnect between root complex 120 and device 130, to write different portions of the exposed memory of memory 134).
CLARK does not explicitly disclose wherein an order of processing via the first interconnect is maintained between the first write operation and the second write operation.
FOLEY discloses wherein an order of processing via the first interconnect is maintained between the first write operation and the second write operation ([0027], “The set of data accesses complies with the ordering information. A data hazard that was detected can be resolved. Since a data hazard can be based on memory access conflicts such as write-after-read, read-after-write, and write-after-write conflicts, loads and/or stores can be delayed. The delay can be accomplished by holding data for the load and/or store in buffers. The data held in buffers can then be committed after the data hazard detection and mitigation window has expired.” E.g. avoiding write-after-write conflicts by enforcing order of write operations by delaying some data write after detecting conflict).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify CLARK’s multi-processor system and method with WILLHALM’s point-to-point interconnect interface on PCIe bus, and Connor to further include FOLEY’s write operation order enforcement, to resolve data hazards such as write-after-write conflicts (see FOLEY [0027]).
Regarding claim 6, CLARK discloses the computer-implemented method of claim 5, wherein, when performing at least one of the first write operation or the second write operation, at least one of the first portion of the first memory or the second portion of the first memory is owned by the second processor ([0020]-[0022], “root complex 120 may be arranged to function as a type of peripheral component interface express (PCIe) root complex for CPU 107 and/or other elements of host computing device 105 to communicate with devices such as device 130 via use of PCIe-based communication protocols and communication links…. As shown in FIG. 1 and described more below, root complex 120 includes host-managed device memory (HDM) decoders 126 that may be programmed to facilitate a mapping of host to device physical addresses for use in system memory (e.g., pooled system memory)…. As described more below, memory transaction logic 133 and IO transaction logic 135 may be included in logic and/or features of device 130 that serve a role in exposing or reclaiming portions of device memory 134 based on what amount of memory capacity is or is not needed by compute circuitry 136 or device 130. The exposed portions of device memory 134, for example, available for use in a pooled or shared system memory that is shared with host compute device 105's host system memory 110 and/or other with other device memory of other device(s) coupled with host compute device 105”. E.g. Host compute device 105 may access the exposed portions, as first memory, of device memory 134 using HDM decoded memory address map, as host-managed host-visible memory, thus, the memory portions are owned by the host or the second processor).
Claims 8 and 17 are rejected under 35 U.S.C. 103 as being unpatentable over CLARK in view of WILLHALM, and Connor (cited above) and further in view of Kida et al., US Patent Application Publication Number 20220103536 (herein “KIDA”), and further in view of Zhu et al., US Patent Application Publication Number 20210011853 (herein “ZHU”).
Regarding claim 8, CLARK and WILLHALM do not explicitly disclose the computer-implemented method of claim 1, wherein: the first processor comprises a first dielet that is coupled to the first interconnect and a second dielet that is coupled to the second interconnect, and the first dielet is coupled to the second dielet via an interconnect with a first throughput that is higher than a second throughput of the first interconnect.
KIDA discloses wherein: the first processor comprises a first dielet that is coupled to the first interconnect and a second dielet that is coupled to the second interconnect (FIGs. 6B and 6C, [0112]-[0117], “Each chiplet can be fabricated as separate semiconductor die and coupled with the substrate 680 via an interconnect structure 673. The interconnect structure 673 may be configured to route electrical signals between the various chiplets and logic within the substrate 680…. The package interconnect 683 may be coupled to a surface of the substrate 680 to route electrical signals to other electrical devices, such as a motherboard, other chipset, or multi-chip module.” E.g. the chiplets may each be connected to the substrate 680 and routed to one or more external interconnects), and
the first dielet is coupled to the second dielet via an interconnect (FIGs. 6B and 6C, [0112]-[0117], “a logic or I/O chiplet 674 and a memory chiplet 675 can be electrically coupled via a bridge 687 that is configured to route electrical signals between the logic or I/O chiplet 674 and a memory chiplet 675”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify CLARK’s multi-processor system and method with WILLHALM’s point-to-point interconnect interface on PCIe bus, and Connor to further include KIDA’s chiplet integrated interconnects, to allow for hybrid processor designs that can mix and match different technology IP blocks (see KIDA [0119]).
CLARK, WILLHALM, and KIDA do not explicitly teach an interconnect with a first throughput that is higher than a second throughput of the first interconnect.
ZHU discloses an interconnect with a first throughput that is higher than a second throughput of the first interconnect ([0080], “the high-speed links 440-443 support a communication throughput of 4 GB/s, 30 GB/s, 80 GB/s or higher, depending on the implementation. Various interconnect protocols may be used including, but not limited to, PCIe 4.0 or 5.0 and NVLink 2.0. However, the underlying principles of the invention are not limited to any particular communication protocol or throughput”.[0081] “two or more of the GPUs 410-413 are interconnected over high-speed links 444-445, which may be implemented using the same or different protocols/links than those used for high-speed links 440-443. Similarly, two or more of the multi-core processors 405-406 may be connected over high speed link 433 which may be symmetric multi-processor (SMP) buses operating at 20 GB/s, 30 GB/s, 120 GB/s or higher.” E.g. various speeds at various protocols for the interconnects are possible, thus, first throughput may be higher than the second throughput).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify CLARK’s multi-processor system and method with WILLHALM’s point-to-point interconnect interface on PCIe bus, Connor, and KIDA’s chiplet integrated interconnects, to further include ZHU’s higher-throughput interconnect, to allow for low latency, high capacity memory access (see ZHU Abstract).
Regarding claim 17, the applicant is directed to the rejection of claim 8 set forth above, as claim 17 is rejected based on the same rationale.
Claims 9-10 and 18-19 are rejected under 35 U.S.C. 103 as being unpatentable over CLARK (cited above) in view of WILLHALM (cited above) and Connor (cited above) and further in view of ZHU.
Regarding claim 9, CLARK discloses the computer-implemented method of claim 1, wherein: the first interconnect is implemented using one of a Generation 5 Peripheral Component Interconnect Express (Gen5 PCIe®) interface protocol or a Generation 6 Peripheral Component Interconnect Express (Gen6 PCIe®) interface protocol and has a first throughput, and the third interconnect is implemented using one of the Gen5 PCIe® interface protocol or the Gen6 PCIe® interface protocol and has a second throughput ([0020]-[0022], “root complex 120 may be arranged to function as a type of peripheral component interface express (PCIe) root complex for CPU 107 and/or other elements of host computing device 105 to communicate with devices such as device 130 via use of PCIe-based communication protocols and communication links”. E.g. a PCIe-based interconnect between the root complex 120 / network controller and device 130 / first processor, aka the first interconnect with a first throughput).
CLARK does not explicitly disclose the third interconnect comprises a second PCIe interconnect with a second throughput.
ZHU discloses the third interconnect comprises a second PCIe interconnect with a second throughput (FIG. 1, [0037]-[0042], “The computing system 100 includes a processing subsystem 101 having one or more processor(s) 102 and a system memory 104 communicating via an interconnection path that may include a memory hub 105…. Communication paths interconnecting the various components in FIG. 1 may be implemented using any suitable protocols, such as PCI (Peripheral Component Interconnect) based protocols (e.g., PCI-Express), or any other bus or point-to-point communication interfaces and/or protocol(s), such as the NV-Link high-speed interconnect, or interconnect protocols known in the art.”. E.g. third interconnect between the CPU and the network controller/ memory hub 105 may be a PCIe interconnect).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify CLARK’s multi-processor system and method with WILLHALM’s point-to-point interconnect interface on PCIe bus, and Connor to further include ZHU’s second PCIe interconnect, to allow for low latency, high capacity memory access (see ZHU Abstract).
Regarding claim 10, CLARK does not explicitly disclose the computer-implemented method of claim 9, wherein the first throughput is higher than the second throughput.
ZHU discloses the computer-implemented method of claim 9, wherein the first throughput is higher than the second throughput ([0080], “the high-speed links 440-443 support a communication throughput of 4 GB/s, 30 GB/s, 80 GB/s or higher, depending on the implementation. Various interconnect protocols may be used including, but not limited to, PCIe 4.0 or 5.0 and NVLink 2.0. However, the underlying principles of the invention are not limited to any particular communication protocol or throughput”.[0081] “two or more of the GPUs 410-413 are interconnected over high-speed links 444-445, which may be implemented using the same or different protocols/links than those used for high-speed links 440-443. Similarly, two or more of the multi-core processors 405-406 may be connected over high speed link 433 which may be symmetric multi-processor (SMP) buses operating at 20 GB/s, 30 GB/s, 120 GB/s or higher.” E.g. various speeds at various protocols for the interconnects are possible, thus, first throughput may be higher than the second throughput).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify CLARK’s multi-processor system and method with WILLHALM’s point-to-point interconnect interface on PCIe bus and Connor’s many-to-many PCIe switch, to further include ZHU’s second PCIe interconnect, to allow for low latency, high capacity memory access (see ZHU, Abstract).
Regarding claim 18, the applicant is directed to the rejection of claim 9 set forth above, as claim 18 is rejected based on the same rationale.
Regarding claim 19, the applicant is directed to the rejection of claim 10 set forth above, as claim 19 is rejected based on the same rationale.
Claims 11-13 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over CLARK in view of WILLHALM and Connor (cited above), further in view of ZHU, and further in view of KIDA.
Regarding claim 11, CLARK and ZHU do not explicitly disclose the computer-implemented method of claim 9, wherein the second interconnect comprises a chip-to-chip interconnect with a third throughput.
KIDA discloses the computer-implemented method of claim 9, wherein the second interconnect comprises a chip-to-chip interconnect with a third throughput (FIGs. 6B and 6C, [0112]-[0117], “The hardware logic chiplets can include special purpose hardware logic chiplets 672, logic or I/O chiplets 674, and/or memory chiplets 675. The hardware logic chiplets 672 and logic or I/O chiplets 674 may be implemented at least partly in configurable logic or fixed-functionality logic hardware and can include one or more portions of any of the processor core(s), graphics processor(s), parallel processors, or other accelerator devices described herein…. the bridge 687 may simply be a direct connection from one chiplet to another chiplet”. E.g. second interconnect between CPU and GPU may be chip-to-chip interconnect via bridge 687, with third throughput).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify CLARK’s multi-processor system and method with WILLHALM’s point-to-point interconnect interface on PCIe bus, ZHU’s second PCIe interconnect, and Connor’s many-to-many PCIe switch, to further include KIDA’s chiplet-integrated interconnects, to allow for hybrid processor designs that can mix and match different technology IP blocks (see KIDA, [0119]).
Regarding claim 12, CLARK does not explicitly disclose the computer-implemented method of claim 11, wherein the third throughput is higher than each of the first throughput and the second throughput.
ZHU discloses the computer-implemented method of claim 11, wherein the third throughput is higher than each of the first throughput and the second throughput ([0080], “the high-speed links 440-443 support a communication throughput of 4 GB/s, 30 GB/s, 80 GB/s or higher, depending on the implementation. Various interconnect protocols may be used including, but not limited to, PCIe 4.0 or 5.0 and NVLink 2.0. However, the underlying principles of the invention are not limited to any particular communication protocol or throughput”. E.g. various speeds at various protocols for the interconnects are possible, thus, third throughput may be higher than each of the first throughput and second throughput).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify CLARK’s multi-processor system and method with WILLHALM’s point-to-point interconnect interface on PCIe bus and Connor’s many-to-many PCIe switch, to further include ZHU’s second PCIe interconnect, to allow for low latency, high capacity memory access (see ZHU, Abstract).
Regarding claim 13, CLARK discloses the computer-implemented method of claim 11, wherein: the first address map comprises a base address register (BAR) address map associated with the first PCIe interconnect, and the second address map comprises a host-managed device memory (HDM) address map associated with the chip-to-chip interconnect ([0020]-[0022], “root complex 120 may be arranged to function as a type of peripheral component interface express (PCIe) root complex for CPU 107 and/or other elements of host computing device 105 to communicate with devices such as device 130 via use of PCIe-based communication protocols and communication links…. As shown in FIG. 1 and described more below, root complex 120 includes host-managed device memory (HDM) decoders 126 that may be programmed to facilitate a mapping of host to device physical addresses for use in system memory (e.g., pooled system memory)…. As described more below, memory transaction logic 133 and IO transaction logic 135 may be included in logic and/or features of device 130 that serve a role in exposing or reclaiming portions of device memory 134 based on what amount of memory capacity is or is not needed by compute circuitry 136 or device 130. The exposed portions of device memory 134, for example, available for use in a pooled or shared system memory that is shared with host compute device 105's host system memory 110 and/or other with other device memory of other device(s) coupled with host compute device 105”. E.g. root complex 120 as network controller may access the exposed portions, as first memory, of device memory 134 using PCIe-based memory address map (BAR address map is what PCIe uses) via first interconnect between root complex 120 and device 130. Host compute device 105 may access the exposed portions, as first memory, of device memory 134 using HDM decoded memory address map. Second interconnect being the connection between host CPU 107 and root complex 120).
Regarding claim 20, the applicant is directed to the rejection of claim 11 set forth above, as claim 20 is rejected based on the same rationale.
Response to Arguments
Applicant's arguments filed 12/8/2025 have been fully considered but they are not persuasive.
For claims 1 and 14, Applicant argues that the cited references do not disclose the amended limitations. The Office disagrees.
In the present Office action, the updated combination of references renders the amended limitations obvious.
Specifically, Clark in view of Willhalm does not disclose, but Connor discloses: to support data transfers; a switch to support one or more of configuration operations, control operations, register read operations, register write operations, or interrupt operations (e.g., FIGs. 3-3C; [0045], “FIG. 3 further shows how data corresponding to a packet 228 that is received by a NIC at a first node (A) but contains data that is to be written to a memory resource on a second node (B) is handled under NUMA platform architecture 300.”; [0050], “…PCIe root complexes, architecture 300a includes PCIe interconnects 303a1, 303a2, 303b1, and 303b2, which are coupled to a many-to-many PCIe switch 301a. To help facilitate packet routing within the switch, many-to-many PCIe switch 301a includes address maps 311a1, 311a2, 311b1, and 311b2”).
It would have been obvious to one of ordinary skill in the art prior to the filing date of the claimed invention to modify the peripheral component interface express (PCIe) root complex for a CPU and/or other elements of a host computing device to communicate with devices, including a point-to-point interconnect among processors, as disclosed by Clark and Willhalm, with Connor, providing the benefit of computer platforms and architectures employing many-to-many and many-to-one peripheral switches internally within a computer system (see Connor, [0002]); under a NUMA architecture, processors (and processor cores) are enabled to access different memory resources distributed across the platform ([0036]).
Applicant’s arguments for dependent claims 2-13 and 15-20 are based on their respective base independent claims 1 and 14, which are addressed above.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to GAUTAM SAIN whose telephone number is (571)270-3555. The examiner can normally be reached M-F 9-5.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Jared Rutz can be reached at 571-272-5535. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/GAUTAM SAIN/Primary Examiner, Art Unit 2135