DETAILED ACTION
Claims 1-20 are pending.
Priority: 7/2/2025 (Provisional)
Assignee: Intel
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 112
The following is a quotation of the first paragraph of pre-AIA 35 U.S.C. 112:
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
Claims 1-14 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or, for applications subject to pre-AIA 35 U.S.C. 112, the applicant) regards as the invention.
Where applicant acts as his or her own lexicographer to specifically define a term of a claim contrary to its ordinary meaning, the written description must clearly redefine the claim term and set forth the uncommon definition so as to put one reasonably skilled in the art on notice that the applicant intended to so redefine that claim term. Process Control Corp. v. HydReclaim Corp., 190 F.3d 1350, 1357, 52 USPQ2d 1029, 1033 (Fed. Cir. 1999). The term “memory” in claims 1-14 is used by the claims to mean “memory device,” while the accepted meaning is “memory array.” The term is indefinite because the specification does not clearly redefine the term. Memory is a semiconductor material for storing data and does not contain controller elements. “Memory device” is utilized for examination purposes.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1-5, 8-12, and 15-19 are rejected under 35 U.S.C. 103 as being unpatentable over Kim et al. (20240330665) in view of Nguyen et al. (20250384001).
As per claim 1, Kim discloses:
A computer memory for deep learning(Kim, [Fig. 2 – deep learning]), the computer memory(Kim, [0327 -- The on-chip memory (OCM) may include first to eighth L0 memories 120a to 120h and the shared memory 2000.], [0341 -- FIG. 40 is an enlarged block diagram of the area A in FIG. 38. Referring to FIGS. 38 and 40, the shared memory 2000 may include a first L0 memory controller 122_1a, a second L0 memory controller 122_1b, a fifth L0 memory controller 122_1e, a sixth L0 memory controller 122_1f, first to eighth memory units 2100a to 2100h, and a global controller 2200]) comprising:
a plurality of memory banks(Kim, [0352 – Fig. 40, The first to eighth memory units 2100a to 2100h may each include at least one memory bank. The first memory unit 2100a may include at least one first memory bank 2110a.]);
a plurality of clock domain crossing (CDC) buffers(Kim, [0378 -- The operating clock frequency of the second path unit (P2) may not be synchronized with the operating clock frequency of the bank controller (Bc). In this case, a clock domain crossing (CDC) work may be required to synchronize the clocks between the bank controller (Bc) and the second path unit (P2).]), different ones of the plurality of CDC buffers communicatively coupled to different ones of the plurality of memory banks(Kim, [0362 -- FIG. 41 is a diagram provided to explain the first memory bank of FIG. 40 in detail. Although FIG. 41 illustrates the first memory bank 2110a, the other memory banks may also have the same structure as the first memory bank 2110a.]);
Although Kim discloses:
Memory banks, a neural network for deep learning, and CDC components;
Kim does not explicitly disclose the following; however, Nguyen discloses:
and one or more bank selection modules configured to select a memory bank from the plurality of memory banks for a data transfer request(Nguyen, [0042 -- Generally, the SMC 142 functions as a memory controller that manages access to the shared memory 140 by handling address decoding, bank selection, and access arbitration between multiple requestors within the OCM subsystem 122], [0047 -- In aspects, each bank may contain a dedicated control circuitry that manages timing and access arbitration, allowing multiple concurrent operations from different requestors.]) for computation in a neural network(Nguyen, [0037 -- In various implementations, block 118 may represent specialized functional units within the SoC architecture 100, such as hardware accelerators, DSPs, or other application-specific processing elements. Generally, block 118 performs dedicated computational tasks that benefit from hardware specialization, such as encryption, video processing, or neural network inference]), wherein a CDC buffer communicatively coupled to the selected memory bank is configured to store the data transfer request before the data transfer request is transmitted to the selected memory bank(Nguyen, [0008 -- The SoC contains an on-chip memory (OCM) subsystem that is coupled to the AXI interconnect, where the OCM subsystem comprises memory banks, a Direct Memory Access (DMA) interconnect that is coupled directly with respective memories of the one or more processor cores, and a shared memory controller (SMC) that is coupled with the AXI interconnect, the memory banks, and the DMA interconnect.], [0042 -- Generally, the SMC 142 functions as a memory controller that manages access to the shared memory 140 by handling address decoding, bank selection, and access arbitration between multiple requestors within the OCM subsystem 122], [0082 -- As depicted, the DMA interconnect 146 includes 2-1 mux 312, Hop 0 314 with associated light-weight (LW) DMA compute engine 318 and connected memory 
(Mem 0) 316, Hop 1 320 with associated clock domain crossing (CDC) circuit 322 (CDC 322) and connected memory (Mem 1) 324, Hop N 330 with associated the LW DMA compute engine 334 and connected memory (Mem N) 332, and terminator 340.], [0086 -- The DMA engine in Hop 0, in response to receiving the DMA command, can then issue a direct memory read to the memory connected to Hop 0 314 and send a memory write request to the memory connected to Hop 1 320], [0092 -- while memory subsystems that operate in different clock domains connect through CDC circuits], [0108 -- Address translation and timing synchronization occur before accessing the memory bank.]).
Therefore, it would have been obvious to a person of ordinary skill in the art at the time of filing to incorporate the features of Nguyen into the system of Kim for the benefit of more efficient data transfer management and data distribution (Nguyen, [0036]).
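For illustration only and not as part of the claim mapping, the arrangement recited in claim 1 — a bank selection module that decodes a target bank, and a per-bank CDC buffer that stages the data transfer request before it reaches the selected bank — can be sketched behaviorally. The class names, the interleaved address decode, and the toy bank storage below are hypothetical and are taken from neither Kim nor Nguyen:

```python
from collections import deque

class CDCBuffer:
    """Per-bank FIFO that holds requests while they cross clock domains."""
    def __init__(self):
        self._fifo = deque()

    def push(self, request):   # written in the faster (requestor) domain
        self._fifo.append(request)

    def pop(self):             # drained in the bank's clock domain
        return self._fifo.popleft() if self._fifo else None

class BankSelectionModule:
    """Selects a target bank by address decoding, then stages the request
    in that bank's CDC buffer rather than writing the bank directly."""
    def __init__(self, num_banks):
        self.buffers = [CDCBuffer() for _ in range(num_banks)]
        self.banks = [dict() for _ in range(num_banks)]  # toy bank storage

    def submit(self, addr, value):
        bank_id = addr % len(self.banks)   # simple interleaved decode
        self.buffers[bank_id].push((addr, value))
        return bank_id

    def drain(self, bank_id):
        """Model one bank-side clock edge: consume one staged request."""
        req = self.buffers[bank_id].pop()
        if req is not None:
            addr, value = req
            self.banks[bank_id][addr] = value
        return req

mem = BankSelectionModule(num_banks=4)
bank = mem.submit(addr=6, value=0xAB)  # request staged, not yet in the bank
assert mem.banks[bank] == {}
mem.drain(bank)                        # bank-domain clock consumes it
assert mem.banks[bank][6] == 0xAB
```

The point of the staging step is that the request is held in the CDC buffer until the bank's (unsynchronized) clock domain is ready to consume it.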
As per claim 2, the rejection of claim 1 is incorporated; in addition, Kim discloses:
wherein the memory banks are arranged in a plurality of bank groups,(Kim, [0352 -- The first to eighth memory units 2100a to 2100h may each include at least one memory bank. The first memory unit 2100a may include at least one first memory bank 2110a.]);
and the computer memory further comprises a group selection module configured to: receive the data transfer request(Kim, [0346 -- The global controller 2200 may control all of the first to eighth memory units 2100a to 2100h. Specifically, if each of the first to eighth memory units 2100a to 2100h logically operates in the global memory format (i.e., not logically operating in the L0 memory format), the global controller 2200 may control the first memory unit 2100a to eighth memory unit 2100h.]);
select a bank group from the plurality of bank groups, wherein the one or more bank selection modules comprise a bank selection module corresponding to the selected bank group(Kim, [0351 -- The global controller 2200 may connect the first to eighth memory units 2100a to 2100h to the global interconnection 6000 of FIG. 3]);
and transmit the data transfer request to the bank selection module(Kim, [0351 -- The first to eighth memory units 2100a to 2100h may exchange data with the off-chip memory 30 of FIG. 2 by the global controller 2200, or exchange data with each of the first to eighth L0 memories 120a to 120h.]).
As per claim 3, the rejection of claim 2 is incorporated; in addition, Kim discloses:
wherein the group selection module is in a first clock domain, and the one or more bank selection modules are in a second clock domain that is slower than the first clock domain(Kim, [0377 -- The second path unit (P2) may configure async-path. The operating clock frequency of the second path unit (P2) may be the same as that of the global interconnection 6000. The second path unit (P2) may also operate at the same clock frequency as the operating clock frequency of the global interconnection 6000.]).
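For illustration only, the two-level selection recited in claims 2-3 — a group selection module in a faster first clock domain feeding bank selection modules in a slower second clock domain — can be modeled as two decode stages joined by a queue. The decode arithmetic and all names below are hypothetical and appear in neither reference:

```python
from collections import deque

BANKS_PER_GROUP = 4

class BankSelector:
    """Slower-domain stage: one per bank group."""
    def __init__(self):
        self.inbound = deque()   # fed by the group selector
        self.banks = [dict() for _ in range(BANKS_PER_GROUP)]

    def tick(self):
        """One slow-domain clock: decode bank bits, commit one request."""
        if self.inbound:
            addr, value = self.inbound.popleft()
            self.banks[addr % BANKS_PER_GROUP][addr] = value

class GroupSelector:
    """Faster-domain front end: decodes group bits, then hands the
    request to the selected group's bank selection module."""
    def __init__(self, num_groups):
        self.group_modules = [BankSelector() for _ in range(num_groups)]

    def receive(self, addr, value):
        group = (addr // BANKS_PER_GROUP) % len(self.group_modules)
        self.group_modules[group].inbound.append((addr, value))
        return group

gs = GroupSelector(num_groups=2)
g = gs.receive(addr=13, value=7)  # group = (13 // 4) % 2 = 1
gs.group_modules[g].tick()        # slower domain commits the request
assert gs.group_modules[g].banks[13 % 4][13] == 7
```

The inbound queue between the two stages is where requests accumulate when the first clock domain outpaces the second.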
As per claim 4, the rejection of claim 1 is incorporated; in addition, Kim discloses:
further comprising data paths, wherein each CDC buffer is arranged between a different memory bank and a bank selection module along a different data path(Kim, [0378 -- The operating clock frequency of the second path unit (P2) may not be synchronized with the operating clock frequency of the bank controller (Bc). In this case, a clock domain crossing (CDC) work may be required to synchronize the clocks between the bank controller (Bc) and the second path unit (P2)]).
As per claim 5, the rejection of claim 1 is incorporated; in addition, Kim discloses:
further comprising additional CDC buffers, wherein each memory bank is communicatively coupled to a different one of the CDC buffer and a different one of the additional CDC buffer(Kim, [0145 -- Additionally, the external interface may also be connected to the global interconnection 6000. The global interconnection 6000 may be a path through which data moves between at least one neural processor 1000, the shared memory 2000, the DMA 3000, the non-volatile memory controller 4000, the volatile memory controller 5000, the command processor 7000, and the external interface.]).
As per claim 8, Kim discloses:
An apparatus for deep learning(Kim, [Fig. 2 – deep learning]), the apparatus comprising:
one or more processing elements, each processing element configured to perform a deep learning operation(Kim, [0105 -- The host processor (H_pr) may also transmit a task to the neural core SoC 10 through commands. The host processor (H_pr) may be an entity that gives instructions for works, and may be a kind of host that instructs the neural core SoC 10. That is, the neural core SoC 10 may efficiently perform parallel computational works such as deep learning works according to the instructions of the host processor (H_pr).]);
and a memory(Kim, [0327 -- The on-chip memory (OCM) may include first to eighth L0 memories 120a to 120h and the shared memory 2000.], [0341 -- FIG. 40 is an enlarged block diagram of the area A in FIG. 38. Referring to FIGS. 38 and 40, the shared memory 2000 may include a first L0 memory controller 122_1a, a second L0 memory controller 122_1b, a fifth L0 memory controller 122_1e, a sixth L0 memory controller 122_1f, first to eighth memory units 2100a to 2100h, and a global controller 2200]) comprising:
a plurality of memory banks(Kim, [0352 – Fig. 40, The first to eighth memory units 2100a to 2100h may each include at least one memory bank. The first memory unit 2100a may include at least one first memory bank 2110a.]), a plurality of clock domain crossing (CDC) buffers(Kim, [0378 -- The operating clock frequency of the second path unit (P2) may not be synchronized with the operating clock frequency of the bank controller (Bc). In this case, a clock domain crossing (CDC) work may be required to synchronize the clocks between the bank controller (Bc) and the second path unit (P2).]), different ones of the plurality of CDC buffers communicatively coupled to different ones of the plurality of memory banks(Kim, [0362 -- FIG. 41 is a diagram provided to explain the first memory bank of FIG. 40 in detail. Although FIG. 41 illustrates the first memory bank 2110a, the other memory banks may also have the same structure as the first memory bank 2110a.]),
Although Kim discloses:
Memory banks, a neural network for deep learning, and CDC components;
Kim does not explicitly disclose the following; however, Nguyen discloses:
and one or more bank selection modules configured to select a memory bank from the plurality of memory banks for a data transfer request(Nguyen, [0042 -- Generally, the SMC 142 functions as a memory controller that manages access to the shared memory 140 by handling address decoding, bank selection, and access arbitration between multiple requestors within the OCM subsystem 122], [0047 -- In aspects, each bank may contain a dedicated control circuitry that manages timing and access arbitration, allowing multiple concurrent operations from different requestors.]) for computation in a neural network(Nguyen, [0037 -- In various implementations, block 118 may represent specialized functional units within the SoC architecture 100, such as hardware accelerators, DSPs, or other application-specific processing elements. Generally, block 118 performs dedicated computational tasks that benefit from hardware specialization, such as encryption, video processing, or neural network inference]), the data transfer request comprising data computed or to be used by the one or more processing elements,(Nguyen, [0025 -- As depicted, the SoC architecture 100 may include core clusters (e.g., core cluster A 110 and core cluster B 112), low-latency memory (e.g., low-latency memory A 114 and low-latency memory B 116), block 118, a dedicated memory 120, an OCM subsystem 122]);
wherein a CDC buffer communicatively coupled to the selected memory bank is configured to store the data transfer request before the data transfer request is transmitted to the selected memory bank(Nguyen, [0008 -- The SoC contains an on-chip memory (OCM) subsystem that is coupled to the AXI interconnect, where the OCM subsystem comprises memory banks, a Direct Memory Access (DMA) interconnect that is coupled directly with respective memories of the one or more processor cores, and a shared memory controller (SMC) that is coupled with the AXI interconnect, the memory banks, and the DMA interconnect.], [0042 -- Generally, the SMC 142 functions as a memory controller that manages access to the shared memory 140 by handling address decoding, bank selection, and access arbitration between multiple requestors within the OCM subsystem 122], [0082 -- As depicted, the DMA interconnect 146 includes 2-1 mux 312, Hop 0 314 with associated light-weight (LW) DMA compute engine 318 and connected memory (Mem 0) 316, Hop 1 320 with associated clock domain crossing (CDC) circuit 322 (CDC 322) and connected memory (Mem 1) 324, Hop N 330 with associated the LW DMA compute engine 334 and connected memory (Mem N) 332, and terminator 340.], [0086 -- The DMA engine in Hop 0, in response to receiving the DMA command, can then issue a direct memory read to the memory connected to Hop 0 314 and send a memory write request to the memory connected to Hop 1 320], [0092 -- while memory subsystems that operate in different clock domains connect through CDC circuits], [0108 -- Address translation and timing synchronization occur before accessing the memory bank.]).
Claims 9-12 are apparatus claims that implement claims 2-5, respectively; the rejections for these claims therefore cite the identical respective mappings.
As per claim 15, Kim discloses:
A method for deep learning(Kim, [Fig. 2 – deep learning]), comprising:
receiving, by a memory from one or more processing elements, a data transfer request for computation in a neural network(Kim, [0157 -- Referring to FIG. 11, the host processor (H_pr) may transmit the control signals to the command processor 7000 through the host interface (HIO). The control signal may be a signal to instruct to perform each operation including a computational work, a data load/store work, etc.]);
selecting, by one or more bank selection modules in the memory, a memory bank from a plurality of memory banks in the memory(Nguyen, [0042 -- Generally, the SMC 142 functions as a memory controller that manages access to the shared memory 140 by handling address decoding, bank selection, and access arbitration between multiple requestors within the OCM subsystem 122], [0047 -- In aspects, each bank may contain a dedicated control circuitry that manages timing and access arbitration, allowing multiple concurrent operations from different requestors.]);
the memory comprising a plurality of CDC buffers, each of which is communicatively coupled to a different memory bank(Kim, [0362 -- FIG. 41 is a diagram provided to explain the first memory bank of FIG. 40 in detail. Although FIG. 41 illustrates the first memory bank 2110a, the other memory banks may also have the same structure as the first memory bank 2110a.]);
Although Kim discloses:
Memory banks, a neural network for deep learning, and CDC components;
Kim does not explicitly disclose the following; however, Nguyen discloses:
writing the data transfer request into a clock domain crossing (CDC) buffer communicatively coupled to the selected memory bank,(Nguyen, [0008 -- The SoC contains an on-chip memory (OCM) subsystem that is coupled to the AXI interconnect, where the OCM subsystem comprises memory banks, a Direct Memory Access (DMA) interconnect that is coupled directly with respective memories of the one or more processor cores, and a shared memory controller (SMC) that is coupled with the AXI interconnect, the memory banks, and the DMA interconnect.], [0042 -- Generally, the SMC 142 functions as a memory controller that manages access to the shared memory 140 by handling address decoding, bank selection, and access arbitration between multiple requestors within the OCM subsystem 122], [0082 -- As depicted, the DMA interconnect 146 includes 2-1 mux 312, Hop 0 314 with associated light-weight (LW) DMA compute engine 318 and connected memory (Mem 0) 316, Hop 1 320 with associated clock domain crossing (CDC) circuit 322 (CDC 322) and connected memory (Mem 1) 324, Hop N 330 with associated the LW DMA compute engine 334 and connected memory (Mem N) 332, and terminator 340.], [0086 -- The DMA engine in Hop 0, in response to receiving the DMA command, can then issue a direct memory read to the memory connected to Hop 0 314 and send a memory write request to the memory connected to Hop 1 320], [0092 -- while memory subsystems that operate in different clock domains connect through CDC circuits], [0108 -- Address translation and timing synchronization occur before accessing the memory bank.]);
and transmitting the data transfer request from the CDC buffer to the selected memory bank(Nguyen, [0086 -- The DMA engine in Hop 0, in response to receiving the DMA command, can then issue a direct memory read to the memory connected to Hop 0 314 and send a memory write request to the memory connected to Hop 1 320]).
Claims 16-19 are method claims that implement claims 2-5, respectively; the rejections for these claims therefore cite the identical respective mappings.
Claims 6 and 13 are rejected under 35 U.S.C. 103 as being unpatentable over Kim et al. (20240330665) in view of Nguyen et al. (20250384001), and further in view of Fleming et al. (20190205284).
As per claim 6, the rejection of claim 5 is incorporated; in addition, Kim does not explicitly disclose the following; however, Fleming discloses:
wherein an additional CDC buffer communicatively coupled to the selected memory bank is configured to store a response to the data transfer request before the response is transmitted to the one or more bank selection modules(Fleming, [0184 -- tendencies specified, the address and completion buffer slot are sent off to the memory system by the scheduler (e.g., via memory command 1042). When the result returns to multiplexer 1040 (shown schematically), it is stored into the completion buffer slot it specifies (e.g., as it carried the target slot all along though the memory system). The completion buffer sends results back into local network (e.g., local network 1002, 1004, 1006, or 1008) in the order the addresses arrived.]).
Therefore, it would have been obvious to a person of ordinary skill at the time of filing to incorporate the features of Fleming into the system of Kim for the benefit of providing high performance and extreme energy efficiency, characteristics relevant to all forms of computing ranging from supercomputing and datacenters to the internet-of-things (Fleming, [0436]).
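For illustration only, the completion-buffer behavior quoted from Fleming above — results stored into a pre-assigned slot, possibly out of order, and released in the order the addresses arrived — can be sketched as follows. The class and method names are hypothetical:

```python
from collections import deque

class CompletionBuffer:
    """Holds responses in pre-allocated slots; results may complete out
    of order but are released in request-arrival order."""
    def __init__(self, size):
        self.slots = [None] * size
        self.order = deque()          # slot numbers in arrival order
        self.free = deque(range(size))

    def allocate(self):
        slot = self.free.popleft()    # slot travels with the memory request
        self.order.append(slot)
        return slot

    def complete(self, slot, result):
        self.slots[slot] = result     # out-of-order completion is fine

    def release(self):
        """Return the oldest request's result once it has completed."""
        if self.order and self.slots[self.order[0]] is not None:
            slot = self.order.popleft()
            result, self.slots[slot] = self.slots[slot], None
            self.free.append(slot)
            return result
        return None

cb = CompletionBuffer(size=4)
a, b = cb.allocate(), cb.allocate()
cb.complete(b, "second")          # younger request completes first
assert cb.release() is None       # but cannot be released yet
cb.complete(a, "first")
assert cb.release() == "first"    # arrival order is preserved
assert cb.release() == "second"
```

This corresponds to the additional CDC buffer of claim 6 storing a response before it is transmitted back toward the bank selection modules.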
Claim 13 is an apparatus claim that implements claim 6; the rejection for this claim therefore cites the identical mapping.
Claims 7, 14, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Kim et al. (20240330665) in view of Nguyen et al. (20250384001), and further in view of Ovsiannikov et al. (20190392287).
As per claim 7, the rejection of claim 1 is incorporated; in addition, Kim in view of Nguyen does not explicitly disclose the following; however, Ovsiannikov discloses:
wherein a bank selection module comprises a demultiplexer(Ovsiannikov, [0475 -- FIG. 4AC illustrates local OFM connections between a tile and its local SRAM bank set. Tile 102 outputs finished or partial results to OFM delivery fabric, which transports that data to the local SRAM bank set as well as other SRAM bank sets elsewhere and makes that data available to SRAM banks B0 through B3 via a demultiplexer 405.]).
Therefore, it would have been obvious to a person of ordinary skill at the time of filing to incorporate the features of Ovsiannikov into the system of Kim for the benefit of reducing area and power by eliminating activation staging first-in, first-out (FIFO) registers, connecting multiplexers to the multi-port cache output directly, and revising the cache read logic to fetch input feature maps (IFM) from the cache to the multiplexers directly in a correct order (Ovsiannikov, [0408]).
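For illustration only, the demultiplexer-based bank selection quoted from Ovsiannikov can be reduced to a short behavioral model: a 1-to-N demultiplexer routes one input value onto the single output line chosen by the select signal. The function below is hypothetical:

```python
def demux(select, value, num_outputs):
    """1-to-N demultiplexer: routes value to the selected output line;
    all other outputs stay inactive (None)."""
    outputs = [None] * num_outputs
    outputs[select] = value
    return outputs

# Route an OFM result to bank B2 of four banks B0..B3.
assert demux(select=2, value="ofm_data", num_outputs=4) == \
    [None, None, "ofm_data", None]
```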
Claim 14 is an apparatus claim and claim 20 is a method claim that implement claim 7; the rejections for these claims therefore cite the identical mapping.
Examiner Notes
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Oh (20230315297), where the system has a memory controller receiving a data output command from a host, generating a set of clock signals for outputting data, and controlling the latency of the clock signals. An input/output (I/O) circuit outputs data based on the signals having the controlled latency. The controller has a clock signal generator receiving the command from the host and generating a toggle signal for transmitting a data clock signal. A latency controller utilizes the toggle signal as a latency control signal when the command is input to the generator. A select circuit selects one of the control signal or the latency signal (Oh, abstract).
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ARVIND TALUKDAR whose telephone number is (303)297-4475. The examiner can normally be reached M-F, 10 am-6pm EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Hosain Alam can be reached at 571-272-3978. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
Arvind Talukdar
Primary Examiner
Art Unit 2132
/ARVIND TALUKDAR/Primary Examiner, Art Unit 2132