Prosecution Insights
Last updated: May 29, 2026
Application No. 18/984,852

DATA ACCESS IN A HETEROGENEOUS PROCESSING SYSTEM WITH MULTIPLE PROCESSORS

Final OA §103
Filed: Dec 17, 2024
Priority: Jan 19, 2023 — continuation of 12/169,459
Examiner: AHMED, ZUBAIR
Art Unit: 2132
Tech Center: 2100 — Computer Architecture & Software
Assignee: Sambanova Systems Inc.
OA Round: 2 (Final)
Grant Probability: 68% (Favorable)
OA Rounds: 2-3
Est. Remaining: 1y 3m
With Interview: 72%

Examiner Intelligence

Grants 68% — above average
Career Allowance Rate: 68% (370 granted / 542 resolved; +13.3% vs TC avg)
Interview Lift: +3.8% (minimal lift; resolved cases with interview)
Avg Prosecution: 2y 8m (typical timeline; 21 currently pending)
Total Applications: 569 (career history, across all art units)

Statute-Specific Performance

§101: 0.7% (-39.3% vs TC avg)
§103: 90.1% (+50.1% vs TC avg)
§102: 5.6% (-34.4% vs TC avg)
§112: 1.3% (-38.7% vs TC avg)
Based on career data from 542 resolved cases

Office Action

§103
DETAILED ACTION

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA. This Office Action is responsive to amendment filed on 04/07/2026. Claims 1-20 have been examined and are pending in this application.

Terminal Disclaimer

The terminal disclaimer filed on 04/07/2026 disclaiming the terminal portion of any patent granted on this application which would extend beyond the expiration date of US Patent 12,169,459 has been reviewed and is accepted. The terminal disclaimer has been recorded.

Allowable Subject Matter

Claims 5-6, 8, and 17-20 are allowed. Claims 7, 9-10, and 12-16 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

Reasons for Allowance

This communication warrants no examiner's reason for allowance, as the record makes evident the reason for allowance, satisfying the record "as a whole" as required by rule 37 CFR 1.104 (e). Accordingly, the reason for allowance is in all probability evident from the record and no statement for examiner's reason for allowance is necessary (see MPEP 1302.14).

Response to Arguments

Applicant's arguments filed 04/07/2026 with respect to claims 1-4 and 11 have been fully considered but they are not persuasive.

Applicant argues, page 7 of the remarks, “A. ChoFleming Does Not Teach the Required Heterogeneous Processor Architecture”. Applicant further argues, page 8 of the remarks, “There is no teaching in ChoFleming of distinct heterogeneous processors - a reconfigurable first compute processor and a second co-processor - each having their own dedicated local memories, with a mechanism enabling the first processor to directly access the second processor's dedicated memory via mapped physical addresses as part of a compute pipeline.
ChoFleming's processor elements share a common interconnected memory structure and do not disclose the distinct, separately-memorialized, cross-processor direct memory access architecture of claim 1.” The Examiner respectfully disagrees with the Applicant.

First, the heterogeneous processing system as claimed. Claim 1 merely requires, in part, “wherein the heterogeneous processing system includes a host processor, a first processor coupled to a first memory, a second processor coupled to a second memory”. Claim 1 does not require anywhere “a reconfigurable first compute processor” and “a second co-processor” as argued by the Applicant above. In fact, claim 1 does not even require that the “first processor” is different from the “second processor” and the “first memory” is different from the “second memory”. Granted that claim 1 recites “heterogeneous processing system”, but the heterogeneity may be the result of the host processor. In fact, it is not clear to those skilled in the art as to how the three processors that are claimed constitute a heterogeneous processing system. ChoFleming teaches “Depicted accelerator tile 100 is a heterogeneous array comprised of several kinds of PEs [Processing Elements] coupled together via an interconnect network 104. Accelerator tile 100 may include one or more of integer arithmetic PEs, floating point arithmetic PEs, communication circuitry (e.g., network dataflow endpoint circuits), and in-fabric storage, e.g., as part of spatial array of processing elements 101.” Paragraph [0115] and FIG. 1.

Second, Applicant argues above that in the instant claimed invention, there is “a mechanism enabling the first processor to directly access the second processor's dedicated memory via mapped physical addresses as part of a compute pipeline.” The word “pipeline” depicts, as known in the art, a completely different scenario inside the execution unit of a processor.
However, the word “pipeline” is nowhere to be found in the instant claims, not even in the parent claims. It is noted that the other prior art reference Raikin teaches “a mechanism enabling the first processor to directly access the second processor's dedicated memory via mapped physical addresses” required by claim 1.

Applicant argues, page 8 of the remarks, “B. Raikin Operates in a Fundamentally Different Technical Context”. Applicant further argues, page 8 of the remarks, “Raikin addresses peripheral device peer-to-peer memory access over a shared PCIe bus – a networking and I/O context entirely different distinct from the heterogeneous compute pipeline context of claim 1.” The Examiner respectfully disagrees. It appears that Applicant is mistakenly limiting the scope of the disclosure of Raikin to networking and I/O context. Raikin explicitly states “Such peripheral devices may include, for example, … various accelerator modules such as a graphics processing unit (GPU).” Paragraph [0003]. Accelerator modules and GPUs can be part of a compute pipeline context, as is notoriously well-known in the art. Furthermore, Raikin as a secondary reference is used to modify and augment the disclosure of ChoFleming.

Applicant argues, page 8 of the remarks, “C. No Sufficient Motivation to Combine ChoFleming and Raikin”. Applicant further argues, page 9 of the remarks, “The Examiner's stated motivation to combine - that Raikin provides "improved methods and systems for accessing the local memory of a device over PCIe" (Raikin paragraph 0025) - is impermissibly generic under KSR Int'l Co. v. Teleflex Inc., 550 U.S. 398 (2007), which requires articulation of a specific reason why a POSITA would combine the particular teachings to arrive at the claimed invention.” The Examiner respectfully disagrees. It is not clear to the Examiner as to what Applicant means by “impermissibly generic”. This appears to be a nonce term. Raikin clearly improves the invention of ChoFleming.
The translation agent (TA) of Raikin provides a mechanism of reading and writing pages between the system memory and a frame buffer memory in the GPU without involving the CPU or the operating system. See paragraph [0003] of Raikin. This automatic virtual address to physical address translation without the involvement of the CPU or the operating system improves the system of ChoFleming. In other words, the translation agent taught by Raikin improves the system of ChoFleming. As an example of this improvement, see FIG. 81 of ChoFleming where processor 8170 or processor 8180 may directly access the memory of processor 8115.

Applicant argues, page 9 of the remarks, “A POSITA seeking to improve ChoFleming's compute pipeline would not turn to Raikin because: (1) ChoFleming's processor elements are not peripheral PCIe devices of the type addressed by Raikin; (2) Raikin's multi-step TA-mediated translation protocol introduces latency incompatible with the low-latency direct memory access required in a high-throughput compute pipeline; and (3) the specific address mapping mechanism of claim 1 has no functional analog in either reference or their combination. Moreover, combining Raikin's ATS framework with ChoFleming's processor array would require fundamental redesign of both the dataflow execution model and the address translation infrastructure, well beyond ordinary skill in the art.” The Examiner respectfully disagrees. As already mentioned in the previous paragraphs of this Office Action, Raikin explicitly states “Such peripheral devices may include, for example, … various accelerator modules such as a graphics processing unit (GPU).” Paragraph [0003]. Accelerator modules and GPUs can be part of a compute pipeline context, as is notoriously well-known in the art. Further, the modification of ChoFleming by Raikin neither changes the principle of operation of ChoFleming nor renders the invention of ChoFleming inoperative.
Therefore, a POSITA would be motivated to combine the two references.

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-4 and 11 are rejected under 35 U.S.C. 103 as being unpatentable over ChoFleming et al. US 2020/0310994 (“ChoFleming”) in view of Raikin et al. US 2016/0077976 (“Raikin”).

As per independent claim 1, ChoFleming teaches A method (A method comprising, see independent claim 9) for accessing data in a heterogeneous processing system using a dataflow graph having a plurality of nodes connected by edges (FIG. 3B illustrates a dataflow graph 300 for the program source of FIG. 3A … Dataflow graph 300 includes a pick node 304, switch node 306, and multiplication node 308. Para 0132), the method comprising: executing at least a portion of a first node of the plurality of nodes of the dataflow graph using the first processor (array of processing elements 301 is configured to execute the dataflow graph 300 of FIG. 3B, para 0133); executing at least a portion of a second node of the plurality of nodes of the dataflow graph using the second processor (array of processing elements 301 is configured to execute the dataflow graph 300 of FIG. 3B, para 0133).
ChoFleming discloses all of the claim limitations from above and additionally teaches an array of processing elements connected to a memory, but does not explicitly teach “wherein the heterogeneous processing system includes a host processor, a first processor coupled to a first memory, a second processor coupled to a second memory, and switch and bus circuitry that communicatively couples the host processor, the first processor, and the second processor” and “mapping virtual addresses of the second memory to physical addresses of the switch and bus circuitry; and configuring the first processor to directly access the second memory using the mapped physical addresses; and directly accessing, by the first processor, the second memory through the switch and bus circuitry”. However, in an analogous art in the same field of endeavor, Raikin teaches wherein the heterogeneous processing system (FIG. 1 is a block diagram that schematically illustrates a computer system 20 having multiple diverse devices including CPU, GPU, SSD, and HCA communicating via PCIe, paras 0040-0042 and FIG. 1) includes a host processor (Computer system 20 comprises a CPU 32, para 0040 and FIG. 1), a first processor coupled to a first memory (Computer system 20 includes multiple Graphics Processing Units (GPUs), para 0042 and FIG. 1. Each GPU 44A-C in FIG. 1 comprises a local GPU memory 60, para 0047 and FIG. 1), a second processor coupled to a second memory (Computer system 20 includes multiple Graphics Processing Units (GPUs), para 0042 and FIG. 1. Each GPU 44A-C in FIG. 1 comprises a local GPU memory 60, para 0047 and FIG. 1), and switch and bus circuitry that communicatively couples the host processor, the first processor, and the second processor (CPU 32 and GPUs 44A-C are communicatively coupled via a switch fabric 40, para 0042 and FIG. 1. 
Communication over fabric 40 is carried out in accordance with a fabric address space referred to as physical address space or PCIe address space, para 0043 and FIG. 1), mapping virtual addresses of the second memory to physical addresses of the switch and bus circuitry (In system 400, TA 424 provides address translation services to DEV_A including converting DEV_A address space to PCIe address space, para 0110. A PCI BAR (Base Address Register) assigns a range of the PCIe address space to a respective address range of local memory 408 so that this address range can be accessed directly by one or more other devices such as DEV_A, para 0107. A PCIe device may use a virtual address space that is larger than a physical address space of fabric 40, para 0044); configuring the first processor to directly access the second memory using the mapped physical addresses (DEV_A is configured to directly access the local memories of multiple respective devices such as DEV_B, para 0114 and FIGS. 1 and 5-6, using the address translation services provided by TA 424, para 0110); directly accessing, by the first processor, the second memory through the switch and bus circuitry (DEV_A is configured to directly access the local memories of multiple respective devices such as DEV_B, para 0114 and FIGS. 1 and 5-6, using the address translation services provided by TA 424, para 0110). 
Given the teaching of Raikin, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to further modify the scope of the invention of ChoFleming with “wherein the heterogeneous processing system includes a host processor, a first processor coupled to a first memory, a second processor coupled to a second memory, and switch and bus circuitry that communicatively couples the host processor, the first processor, and the second processor” and “mapping virtual addresses of the second memory to physical addresses of the switch and bus circuitry; and configuring the first processor to directly access the second memory using the mapped physical addresses; and directly accessing, by the first processor, the second memory through the switch and bus circuitry”. The motivation would be that the invention provides improved methods and systems for accessing the local memory of a device over PCIe and other suitable bus or network fabric type, para 0025 of Raikin.

As per dependent claim 2, ChoFleming in combination with Raikin discloses the method of claim 1. ChoFleming teaches wherein the method is for implementing a machine learning system using dataflow graphs (Certain embodiments herein permit the introduction of new application-specific PEs, for example, for machine learning or security, and not merely a homogeneous combination. Para 0153).

As per dependent claim 3, ChoFleming in combination with Raikin discloses the method of claim 1. ChoFleming may not explicitly disclose, but Raikin teaches wherein configuring the first processor comprises configuring a reconfigurable dataflow unit (The CPU 32 configures mapping table 64 in the GPU 44A-C to translate between BAR1 addresses and respective E_REGION1 addresses, para 0066 and FIGS. 1 and 2A-B, so that other devices can directly access the local memory of another device, para 0114 and FIGS. 1 and 5-6).
The same motivation that was utilized for combining ChoFleming and Raikin as set forth in claim 1 is equally applicable to claim 3.

As per dependent claim 4, ChoFleming in combination with Raikin discloses the method of claim 1. ChoFleming may not explicitly disclose, but Raikin teaches wherein configuring the first processor comprises configuring a compute engine (The CPU 32 configures mapping table 64 in the GPU 44A-C to translate between BAR1 addresses and respective E_REGION1 addresses, para 0066 and FIGS. 1 and 2A-B, so that other devices can directly access the local memory of another device, para 0114 and FIGS. 1 and 5-6). The same motivation that was utilized for combining ChoFleming and Raikin as set forth in claim 1 is equally applicable to claim 4.

As per dependent claim 11, ChoFleming in combination with Raikin discloses the method of claim 1. ChoFleming may not explicitly disclose, but Raikin teaches wherein the heterogeneous system includes mapping the virtual addresses of the second memory to the physical addresses of the switch and bus circuitry; and wherein the first processor is configured to directly access the second memory using the mapped physical addresses (The CPU 32 configures mapping table 64 in the GPU 44A-C to translate between BAR1 addresses and respective E_REGION1 addresses, para 0066 and FIGS. 1 and 2A-B, so that other devices can directly access the local memory of another device, para 0114 and FIGS. 1 and 5-6). The same motivation that was utilized for combining ChoFleming and Raikin as set forth in claim 1 is equally applicable to claim 11.

Conclusion

Additional references were considered by the Examiner. The references are (1) Galli et al. US 2023/0100873 (“Galli”) and (2) Byers et al. US 2018/0183660 (“Byers”). These references were not applied in the art rejection.

Galli teaches “FIGS.
4 and 5 are block diagrams of example computing systems that illustrate embodiments utilizing memory tagging to track memory modifications and to synchronize memory contents in a heterogeneous computing environment involving multiple devices (e.g., CPU, GPU, etc.). The memory tagging, tracking, and synchronizing embodiments shown in FIGS. 4 and 5 involve computation offloading from one device (e.g., CPU) to another device (e.g., GPU). Computation offload is often a good strategy for achieving higher performance or more efficient execution for portions of certain workloads. For example, highly parallel loops, or matrix multiplication in a machine learning workload can often be executed more efficiently on GPU devices. A portion of code that is offloaded to another device is referred to as an ‘offloaded function’ or a ‘kernel function.’ With the main applications running on a CPU accessing the CPU’s main memory, and a kernel function running on an accelerator (e.g., GPU) accessing the GPU's separate memory, synchronization is needed between the memories of the CPU and the GPU once the offloaded computation is finished.” Paragraph [0062]. Galli in combination with prior art of record Raikin renders obvious independent claim 1 and teaches some of the dependent claims.

Byers teaches “In general, a heterogeneous computing environment refers to a device or set of devices that have multiple processor types. An example heterogeneous computing environment 100 is shown in FIG. 1. As shown, heterogeneous computing environment 100 may include any number of processors 102 and a memory 104 in communication therewith. During operation, processors 102 may execute portions of one or more applications 106 stored in memory 104. Example application portions may include, but are not limited to, threads, sub-routines, functions, container-based code, and the like.
When executed by processor 102, these application portions may also read or write to shared data 108 in memory 104 that is accessible by the various processors 102.” Paragraph [0011]. Byers in combination with prior art of record Raikin renders obvious independent claim 1 and teaches some of the dependent claims.

THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).

A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to ZUBAIR AHMED whose telephone number is (571)272-1655. The examiner can normally be reached 7:30AM - 5:00PM EST.

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, HOSAIN T. ALAM can be reached at (571) 272-3978. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center.
Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/ZUBAIR AHMED/
Examiner, Art Unit 2132

/HOSAIN T ALAM/
Supervisory Patent Examiner, Art Unit 2132

Prosecution Timeline

Dec 17, 2024
Application Filed
Jan 07, 2026
Non-Final Rejection mailed — §103
Apr 07, 2026
Response Filed
May 01, 2026
Final Rejection mailed — §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12638986
OPTIMIZING DIE UTILIZATION IN MULTI-META DIE BASED STORAGE DEVICES
2y 4m to grant Granted May 26, 2026
Patent 12639170
TECHNIQUE TO PERFORM INCREMENTAL HIBERNATE AND RESUME OF BARE METAL CLUSTERS
1y 10m to grant Granted May 26, 2026
Patent 12619560
Computer Memory Expansion Device and Method of Operation
1y 8m to grant Granted May 05, 2026
Patent 12608283
DATA CONNECTOR COMPONENT FOR IMPLEMENTING DATA REQUESTS
1y 10m to grant Granted Apr 21, 2026
Patent 12585590
BROADCAST ASYNCHRONOUS LOADS TO SHARED LOCAL MEMORY
3y 5m to grant Granted Mar 24, 2026
Study what changed to get past this examiner. Based on 5 most recent grants.

Prosecution Projections

Expected OA Rounds: 2-3
Grant Probability: 68%
With Interview: 72% (+3.8%)
Median Time to Grant: 2y 8m (~1y 3m remaining)
PTA Risk: Moderate
Based on 542 resolved cases by this examiner. Grant probability derived from career allowance rate.
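
The projections above can be checked against the figures shown on this page. The sketch below is illustrative only: it assumes the interview lift is added as percentage points, that elapsed time is counted in whole calendar months from the filing date to the "Last updated" date, and all function and variable names are hypothetical rather than taken from the tool.

```python
from datetime import date

# Figures copied from this page; helper names are hypothetical.
GRANTED, RESOLVED = 370, 542
INTERVIEW_LIFT_PTS = 3.8          # assumed to be additive percentage points
MEDIAN_MONTHS_TO_GRANT = 2 * 12 + 8
FILED = date(2024, 12, 17)
LAST_UPDATED = date(2026, 5, 29)


def allowance_rate(granted: int, resolved: int) -> float:
    """Allowance rate as a percentage of resolved cases."""
    if resolved <= 0:
        raise ValueError("resolved must be positive")
    return 100.0 * granted / resolved


def whole_months_between(start: date, end: date) -> int:
    """Whole calendar months elapsed from start to end."""
    months = (end.year - start.year) * 12 + (end.month - start.month)
    if end.day < start.day:
        months -= 1  # the current month has not fully elapsed
    return months


base = allowance_rate(GRANTED, RESOLVED)
with_interview = base + INTERVIEW_LIFT_PTS
remaining_months = MEDIAN_MONTHS_TO_GRANT - whole_months_between(FILED, LAST_UPDATED)

print(f"Grant probability: {base:.0f}%")
print(f"With interview: {with_interview:.0f}%")
print(f"Remaining: {remaining_months // 12}y {remaining_months % 12}m")
```

Under these assumptions the rounded results agree with the 68%, 72%, and roughly 1y 3m figures reported above; a different rounding or date convention in the actual tool could shift them slightly.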
