DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Information Disclosure Statement
The information disclosure statement (IDS) submitted on August 22, 2025, was filed after the February 6, 2024 filing date of the application. The submission is in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statement is being considered by the examiner.
Drawings
The drawings were received on February 6, 2024. These drawings are accepted.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claim 20 is rejected under 35 U.S.C. 101 because the claimed invention is directed to non-statutory subject matter. The claim does not fall within at least one of the four categories of patent eligible subject matter. Claim 20 recites “a computer-readable medium,” where the specification discloses that “computer-readable medium” or “computer-readable media” “may include computer data storage media or communication media including any medium that facilitates transfer of a computer program from one place to another … computer-readable media generally may correspond to: (1) tangible computer-readable storage media, which is non-transitory; or (2) a communication medium such as a signal or carrier wave” (paragraph [0105]). However, a “computer-readable medium” construed to encompass a communication medium, such as a signal or carrier wave, is non-statutory.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claim(s) 1-9, 13, and 16-20 is/are rejected under 35 U.S.C. 103 as being unpatentable over Murphy (US 2024/0411607).
As to claim 1, Murphy discloses an apparatus for graphics processing (Figure 1, testing and optimization system 100, further implemented in system of at least Figure 3), comprising: a memory (resources 110 as data buffers); and a processor coupled to the memory (e.g. graphics processing unit (GPU) 108 coupled to resources 110) and, based on information stored in the memory (e.g. based on resources 110), the processor is configured to (Figure 5): obtain an indication of a set of graphics processor workloads that are to be executed by a graphics processor in an execution order (step 502 notes intercepting a sequence of API calls, which corresponds to a first plurality of compute tasks to be performed to render a first image, where [0023] notes a content application 102 makes a series of calls to at least one interface, such as an application programming interface (API) 106 of a renderer 104, such as may include a rendering application executing on a graphics processing unit (GPU) 108, where a series of API calls can be sent in a determined sequence to cause a corresponding sequence of tasks to be performed in order to perform a specific operation, such as to perform a sequence of individual tasks needed to render an image or video frame, among other such types of content); determine that a first subset of graphics processor workloads in the set of graphics processor workloads produces a set of resources in graphics memory and that a second subset of graphics processor workloads in the set of graphics processor workloads consumes the set of resources in the graphics memory (step 504 notes generating a first resource dependency graph based on dependencies determined between the first plurality of tasks, where [0023] notes the sequence of tasks to be performed can be determined based at least in part on the dependencies between various tasks, such as where the output of a first task is required to be provided as input to a second task, such that the second task cannot 
be executed properly until after the first task has completed; step 506 notes identifying at least one potential optimization of the resource dependency graph that maintains the dependencies, where [0024] notes interceptor 112 intercepts the stream of API calls used to render the initial image 126 and causes a copy of the API calls to be transmitted to a resource usage tracker 114, which analyzes the sequence of calls to determine information such as the respective timing of calls, as well as the resources associated with the individual calls, the resource usage tracker 114 then passes the information to an optimization determination and testing application, such as a frame interceptor application (FI) 116, and [0025] further notes FI 116 takes the information about the sequence of API calls and associated resources and generates a resource dependency graph (RDG) 136 using an RDG builder module 118 or process, and once an RDG is generated for the image or frame, an optimization analyzer 120 analyzes the RDG to attempt to determine any and all possible optimizations); alter, based on the determination, the execution order of the set of graphics processor workloads such that a first index of the first subset of graphics processor workloads occurs within a threshold index separation from a second index of the second subset of graphics processor workloads (step 508 notes generating, based at least in part upon the at least one potential optimizations, a second dependency graph including a second plurality of compute tasks, where [0026] notes the optimization analyzer 120 determines one or more potential optimizations, and provides information for these optimizations to an RDG reordering module 122, which can generate a new, updated, or optimized RDG for the image to be rendered that includes at least one of the identified optimizations, but respects the dependencies between tasks associated with nodes of the graph; step 510 notes causing a second image to be rendered 
using API calls issued according to the second resource dependency graph, where [0026] further notes once a new (or optimized) RDG is generated, a frame reconstruction module 124 of the FI 116 submits the sequence of API calls according to the new RDG, or works with content application 102 to cause a new sequence of API calls to be issued according to the new RDG, among other such options, the renderer 104 executing on GPU 108 (or set of GPUs) can then perform the tasks in the new sequence according to the second RDG, and can generate a second or “optimized” image 128, which should contain substantially identical image content to the initial image 126; step 512 notes determining whether the second image is equivalent to the first image but took less time to render, where [0027] further notes testing of the new reordered RDG, where a comparator 130 of the FI 116 may compare the initial image 126 and the “optimized” image 128 to 1) ensure that the images contain the same or equivalent image content and 2) determine whether the reordered RDG resulted in at least a minimum or threshold level of improvement, whether in rendering time, resource usage, or another such metric); and output an indication of the altered execution order (step 514 notes determining if the process was optimized, e.g. the initial image 126 and “optimized” image 128 are substantially equivalent and/or the reordered RDG resulted in at least a minimum or threshold level of improvement; and, if so, step 516, providing information about the optimization as a recommendation, where [0027] further notes an optimization recommendation 134 may be generated and provided to a developer if the improvement at least meets a minimum threshold or satisfies a specific improvement criterion).
As noted above, Murphy discloses determining dependencies between various tasks, such as where the output of a first task (e.g. considered resources produced) is required to be provided as input to a second task (e.g. considered resources consumed), where the resource usage tracker further determines information regarding the resources 110 associated with the draw calls (e.g. tasks), the resources 110 illustrated as part of GPU 108 (considered resources of graphics memory). Therefore, these determinations collectively are considered to “determine that a first subset of graphics processor workloads in the set of graphics processor workloads produces a set of resources in graphics memory and that a second subset of graphics processor workloads in the set of graphics processor workloads consumes the set of resources in the graphics memory” as claimed, yielding predictable results, without changing the scope of the invention.
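For illustration only, the producer/consumer determination and dependency-respecting reordering discussed above can be sketched as follows. The task names, resource names, and data layout below are the examiner's hypothetical constructs and do not appear in Murphy; the sketch merely shows one conventional way (a resource-to-producer map followed by a topological sort) that such dependencies could be derived and respected:

```python
from collections import defaultdict, deque

def build_dependency_edges(tasks):
    """Map each produced resource to its producer, then add an edge
    producer -> consumer for every task that consumes that resource."""
    producer_of = {}
    for name, (produces, _) in tasks.items():
        for res in produces:
            producer_of[res] = name
    edges = defaultdict(set)
    for name, (_, consumes) in tasks.items():
        for res in consumes:
            if res in producer_of and producer_of[res] != name:
                edges[producer_of[res]].add(name)
    return edges

def reorder(tasks, edges):
    """Kahn's algorithm: emit an execution order that respects every
    producer -> consumer dependency in the graph."""
    indegree = {name: 0 for name in tasks}
    for _, dsts in edges.items():
        for dst in dsts:
            indegree[dst] += 1
    ready = deque(sorted(n for n, d in indegree.items() if d == 0))
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for dst in sorted(edges[node]):
            indegree[dst] -= 1
            if indegree[dst] == 0:
                ready.append(dst)
    return order
```

Any reordering produced this way necessarily places each producing workload before its consuming workloads, which is the constraint Murphy's optimized RDG is described as preserving.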
As to claim 2, Murphy discloses to alter the execution order of the set of graphics processor workloads, the processor is configured to alter the execution order such that an execution of the first subset of graphics processor workloads is prior to an execution of the second subset of graphics processor workloads ([0025] notes optimizations may include reordering tasks, combining tasks, deleting unnecessary tasks or otherwise modifying the timing or positioning of nodes for those tasks in the RDG to be more efficient, while ensuring that the determined dependencies are respected in the optimized RDG, e.g. this may include a second set of tasks that includes the same tasks but in a different ordering or sequence, or may include at least some different tasks, such as where tasks may have been combined or redundant tasks removed, among other such options).
As to claim 3, Murphy discloses the processor is further configured to: generate a graphics processor workload dependency graph based on the determination that the first subset of graphics processor workloads produces the set of resources and that the second subset of graphics processor workloads consumes the set of resources ([0025] notes generating a resource dependency graph (RDG) based on information about the sequence of API calls and associated resources, e.g. information from resource usage tracker 114), and wherein to alter the execution order of the set of graphics processor workloads, the processor is configured to alter the execution order of the set of graphics processor workloads further based on the graphics processor workload dependency graph such that dependencies indicated by the dependency graph are not violated ([0025] notes once an RDG is generated for the image or frame, the optimization analyzer 120 analyzes the RDG to attempt to determine any and all possible optimizations, where these optimizations may include reordering tasks, combining tasks, deleting unnecessary tasks or otherwise modifying the timing or positioning of nodes for those tasks in the RDG to be more efficient, while ensuring that the determined dependencies are respected in the optimized RDG).
As to claim 4, Murphy discloses the altered execution order results in a chain of commands associated with outputs that are produced in the graphics memory or consumed entirely in the graphics memory ([0023] notes the sequence of rendering tasks is executed on the GPU 108 using various resources 110, such as data buffers, to produce an initial image 126, where it is obvious that the resources would further be used to produce the “optimized” image 128, but, e.g., using fewer resources, [0026]).
As to claim 5, Murphy discloses to output the indication of the altered execution order, the processor is configured to: store the indication of the altered execution order in at least one of the memory, a buffer, or a cache; or transmit the indication of the altered execution order ([0027] notes an optimization recommendation 134 may be generated and provided to a developer if the improvement at least meets a minimum threshold or satisfies a specific improvement criterion).
As to claim 6, Murphy discloses the execution order indicates that a first graphics processor workload in the first subset of graphics processor workloads, a second graphics processor workload in the first subset of graphics processor workloads, and a third graphics processor workload in the second subset of graphics processor workloads are to be executed sequentially ([0023] notes the series of API calls can be sent in a determined sequence to cause a corresponding sequence of tasks to be performed in order to perform a specific operation, such as to perform a sequence of individual tasks needed to render an image or video frame, among other such types of content, where it is obvious that the sequence of tasks includes multiple tasks, thus corresponding to multiple graphics processor workloads), and wherein the altered execution order indicates that the first graphics processor workload, the third graphics processor workload, and the second graphics processor workload are to be executed sequentially ([0025] notes once an RDG is generated for the image or frame, the optimization analyzer 120 analyzes the RDG to attempt to determine any and all possible optimizations, where these optimizations may include reordering tasks, combining tasks, deleting unnecessary tasks or otherwise modifying the timing or positioning of nodes for those tasks in the RDG to be more efficient, while ensuring that the determined dependencies are respected in the optimized RDG).
As to claim 7, Murphy discloses the execution order indicates that a first graphics processor workload in the first subset of graphics processor workloads, a second graphics processor workload in the second subset of graphics processor workloads, a third graphics processor workload in the first subset of graphics processor workloads, and a fourth graphics processor workload in the second subset of graphics processor workloads are to be executed sequentially ([0023] notes the series of API calls can be sent in a determined sequence to cause a corresponding sequence of tasks to be performed in order to perform a specific operation, such as to perform a sequence of individual tasks needed to render an image or video frame, among other such types of content, where it is obvious that the sequence of tasks includes multiple tasks, thus corresponding to multiple graphics processor workloads), and wherein the altered execution order indicates that the first graphics processor workload, the second graphics processor workload, the fourth graphics processor workload, and the third graphics processor workload are to be executed sequentially ([0025] notes once an RDG is generated for the image or frame, the optimization analyzer 120 analyzes the RDG to attempt to determine any and all possible optimizations, where these optimizations may include reordering tasks, combining tasks, deleting unnecessary tasks or otherwise modifying the timing or positioning of nodes for those tasks in the RDG to be more efficient, while ensuring that the determined dependencies are respected in the optimized RDG).
As to claim 8, Murphy discloses the set of resources comprises at least one of a set of textures or a set of render targets ([0037] notes at least one compute resource 306, which may obtain or receive data to be used for rendering, may include texture).
As to claim 9, Murphy discloses the processor is further configured to: execute the set of graphics processor workloads based on the altered execution order ([0026] further notes once a new (or optimized) RDG is generated, a frame reconstruction module 124 of the FI 116 submits the sequence of API calls according to the new RDG, or works with content application 102 to cause a new sequence of API calls to be issued according to the new RDG, among other such options, the renderer 104 executing on GPU 108 (or set of GPUs) can then perform the tasks in the new sequence according to the second RDG, and can generate a second or “optimized” image 128, which should contain substantially identical image content to the initial image 126).
As to claim 13, Murphy discloses the obtained indication of the set of graphics processor workloads that are to be executed by the graphics processor in the execution order indicates that the first index of the first subset of graphics processor workloads occurs outside of the threshold index separation from the second index of the second subset of graphics processor workloads ([0027] notes testing of the new reordered RDG, where a comparator 130 of the FI 116 may compare the initial image 126 and the “optimized” image 128 to 1) ensure that the images contain the same or equivalent image content and 2) determine whether the reordered RDG resulted in at least a minimum or threshold level of improvement, whether in rendering time, resource usage, or another such metric; thus, where the new, reordered RDG resulted in at least a minimum or threshold level of improvement, the sequence of tasks prior to optimization may be considered to fall “outside of the threshold” as claimed).
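As a purely illustrative sketch of the claimed "threshold index separation" concept (the function and names below are the examiner's hypothetical, not Murphy's disclosure), proximity between a producing workload and a consuming workload in an execution order could be tested as:

```python
def within_threshold(order, producer, consumer, threshold):
    """Return True when the consumer executes after the producer and
    within `threshold` positions of it in the execution order."""
    gap = order.index(consumer) - order.index(producer)
    return 0 < gap <= threshold
```

Under this framing, an order whose producer-to-consumer gap exceeds the threshold corresponds to the pre-optimization sequence falling "outside of the threshold," which the optimized reordering then corrects.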
As to claim 16, Murphy discloses the apparatus is a wireless communication device comprising at least one of a transceiver or an antenna coupled to the processor (Figure 9 illustrates exemplary system 900 (in which system 100 may be implemented), which includes at least wireless transceiver 926 coupled to processor 902 and/or graphics/video card 912, [0076]).
As to claim 17, Murphy discloses a method of graphics processing, comprising the steps as performed by the apparatus of claim 1. Please see the rejection and rationale of claim 1.
Claims 18 and 19 are similar in scope to claims 2 and 3, respectively, and are therefore rejected under similar rationale.
As to claim 20, Murphy discloses a computer-readable medium (Figure 9, memory 920) storing computer executable code (e.g. storing instructions 919 and data 921), the computer executable code, when executed by a processor (e.g. executable on processor 902, further by a GPU, e.g. as described in claim 1, which may be implemented in graphics/video card 912), causes the processor to perform the steps as performed by the apparatus of claim 1. Please see the rejection and rationale of claim 1.
Claim(s) 10-12 is/are rejected under 35 U.S.C. 103 as being unpatentable over Murphy (US 2024/0411607) as applied to claim 1 above, and further in view of McCrary et al. (US 2011/0050713).
As to claim 10, Murphy does not disclose, but McCrary et al. disclose, wherein to alter the execution order of the set of graphics processor workloads, the processor is configured to: assign a set of weights to the set of graphics processor workloads ([0024] notes the CPU may specify a priority associated with one or more ring buffers and determine a priority level of each command); enqueue the set of graphics processor workloads to a priority queue based on the assigned set of weights ([0024] notes the CPU may define ring buffers for high priority commands, low priority commands, and low latency commands, where commands may be added to a ring buffer based on the determined priority level of each command, [0041] notes each command buffer is enqueued in the ring buffer that best matches the priority level of the command); and dequeue the set of graphics processor workloads from the priority queue, wherein the altered execution order is based on the dequeued set of workloads ([0048] notes the GPU selects command buffers for execution on the GPU according to priority criteria, e.g. the GPU may determine a priority ordering in which the subset of ring buffers 210 selected is to be processed and also how commands are prioritized and scheduled during the processing of each ring buffer, [0049] notes the selected commands are executed on the GPU according to the priority orderings).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Murphy’s system and method of altering the execution order of sets of graphics processor workloads to further incorporate McCrary et al.’s method of assigning priorities (e.g. weights) to workloads such that higher priority workloads may be scheduled and executed prior to lower priority workloads, providing efficiency within the system (see Background of McCrary et al.).
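For illustration only, the assign/enqueue/dequeue sequence recited in claim 10 can be sketched with a binary heap; the workload names, weight values, and lower-weight-means-higher-priority convention below are the examiner's hypothetical choices (McCrary's disclosed mechanism is priority-segregated ring buffers, which this sketch does not reproduce):

```python
import heapq

def schedule_by_weight(workloads):
    """Enqueue (weight, arrival index, name) tuples so that a lower
    weight dequeues first; the arrival index breaks ties so equal-weight
    workloads keep their original relative order."""
    heap = []
    for idx, (name, weight) in enumerate(workloads):
        heapq.heappush(heap, (weight, idx, name))
    order = []
    while heap:
        _, _, name = heapq.heappop(heap)
        order.append(name)
    return order
```

The dequeued sequence then constitutes the altered execution order, with higher priority workloads executed before lower priority ones, consistent with the rationale above.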
As to claim 11, Murphy modified with McCrary et al. disclose the set of weights includes a first weight associated with the first subset of graphics processor workloads (McCrary, e.g. a lower priority level determined for commands) and a second weight associated with the second subset of graphics processor workloads (McCrary, e.g. a higher priority level determined for commands), and wherein the second weight is greater than the first weight (McCrary, e.g. a higher priority is greater than a lower priority, which causes preemption of commands) (McCrary, [0024] notes the CPU may define ring buffers for high priority commands, low priority commands, and low latency commands, where commands may be added to a ring buffer based on the determined priority level of each command, [0041] notes each command buffer is enqueued in the ring buffer that best matches the priority level of the command, [0052] further notes higher priority command buffers may be added to a high priority ring buffer during the execution of one or more lower priority commands, which may cause the GPU to pre-empt one or more lower priority commands to accommodate the higher priority commands).
As to claim 12, Murphy modified with McCrary et al. disclose the set of weights includes a first weight associated with the first subset of graphics processor workloads (McCrary, e.g. a higher priority level determined for commands) and a second weight associated with the second subset of graphics processor workloads (McCrary, e.g. a lower priority level determined for commands), and wherein the second weight is less than the first weight (McCrary, e.g. a lower priority is less than a higher priority, which causes preemption of commands) (McCrary, [0024] notes the CPU may define ring buffers for high priority commands, low priority commands, and low latency commands, where commands may be added to a ring buffer based on the determined priority level of each command, [0041] notes each command buffer is enqueued in the ring buffer that best matches the priority level of the command, [0052] further notes higher priority command buffers may be added to a high priority ring buffer during the execution of one or more lower priority commands, which may cause the GPU to pre-empt one or more lower priority commands to accommodate the higher priority commands).
Claim(s) 14 is/are rejected under 35 U.S.C. 103 as being unpatentable over Murphy (US 2024/0411607) as applied to claim 1 above, and further in view of Livesley et al. (US 2023/0377086).
As to claim 14, Murphy does not disclose, but Livesley et al. disclose the altered execution order minimizes a number of off-chip evictions of the set of resources from the graphics memory to system memory ([0039] notes “reordering” work to be rendered, such that work that is close together will be performed together, meaning that accesses to memory that is spatially close together will be performed close together in time, which increases the likelihood that information fetched into a cache for the rendering engine 402 will be reused before being evicted, which reduces the overall number of misses, improves performance, reduces bandwidth in accesses to external memory, and reduces power consumption as a result).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Murphy’s system and method of altering the execution order of sets of graphics processor workloads to further incorporate Livesley et al.’s method of altering (e.g. reordering) workloads to keep workloads close that are performed together, such that memory accesses may be reused before being evicted to an external memory to reduce the overall number of misses, improve performance, reduce bandwidth in accesses to external memory, and reduce power consumption as a result (see [0039] of Livesley et al.).
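For illustration only, the locality principle in Livesley's rationale (keeping work that touches the same memory close together in time) can be sketched as a stable grouping pass; the task and resource names below are the examiner's hypothetical, and a real reordering would additionally have to respect the task dependencies that Murphy's RDG preserves:

```python
def cluster_by_resource(tasks):
    """Stable grouping: tasks touching the same resource are emitted
    adjacently, groups appear in first-seen resource order, and the
    relative order of tasks within each group is preserved."""
    groups = {}
    for name, resource in tasks:
        groups.setdefault(resource, []).append(name)
    return [name for group in groups.values() for name in group]
```

Executing same-resource tasks back to back increases the chance that data fetched into cache is reused before eviction, which is the reduced-miss, reduced-bandwidth benefit Livesley describes.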
Claim(s) 15 is/are rejected under 35 U.S.C. 103 as being unpatentable over Murphy (US 2024/0411607) as applied to claim 1 above, and further in view of Burke et al. (US 10,672,175).
As to claim 15, Murphy discloses to obtain the indication of the set of graphics processor workloads, the processor is configured to obtain the indication of the set of graphics processor workloads from an application ([0023] notes a content application 102 makes a series of calls to at least one interface, such as an application programming interface (API) 106 of a renderer 104, such as may include a rendering application executing on a graphics processing unit (GPU) 108), but does not disclose altering the execution order via a graphics processor driver. Burke et al., however, disclose wherein to alter the execution order of the set of graphics processor workloads, the processor is configured to alter the execution order of the set of graphics processor workloads via a graphics processor driver (column 25, lines 25-27 notes method 730 may be implemented in applications (e.g. through an API) or driver software, where column 25, lines 32-46 further notes a compiler may check the resource requirement and data address ranges used by the draw calls, and may also determine any order dependency between the draw calls, where the compiler/driver may then re-arrange the order of draw calls to optimize parallel execution and/or to provide more efficient usage of hardware 3D/compute resources, and before and/or during any re-arranging, the driver may ensure that no data dependency is violated).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Murphy’s system and method of altering the execution order of sets of graphics processor workloads to further incorporate Burke et al.’s method of altering the execution order of sets of graphics processor workloads via a graphics processor driver as an alternative means of performing the altering, which provides better utilization and performance (see column 25, lines 32-46 of Burke et al.).
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Burke et al. (US 10,672,175) disclose an apparatus for graphics processing, comprising: a memory; and a processor coupled to the memory, and the processor is configured to (e.g. Figures 7B and 7C, column 24, lines 50-67; Figures 8B and 8C, column 27, lines 12-28): determine that a first subset of graphics processor workloads in the set of graphics processor workloads produces a set of resources in graphics memory and that a second subset of graphics processor workloads in the set of graphics processor workloads consumes the set of resources in the graphics memory (step 731, determining an order dependency between two or more draw calls, which may include step 733, determining a resource requirement and data address range for the two or more draw calls, and step 737, defining a producer stage and a consumer stage with a queue primitive; step 831 notes determining a work split for two or more work items in an order independent mode, which may include step 836, defining a producer stage and a consumer stage with a queue primitive, where Figures 9B and 9C, column 30, lines 33-48 further notes process of defining a producer stage and a consumer stage with a queue primitive); and alter, based on the determination, the execution order of the set of graphics processor workloads (step 732, reordering the two or more draw calls based on the determined order dependency; step 832, reordering the two or more work items based on the determined work split in the order independent mode).
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JACINTA M CRAWFORD whose telephone number is (571)270-1539. The examiner can normally be reached 8:30 a.m. to 4:30 p.m.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, King Y. Poon can be reached at (571)272-7440. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/JACINTA M CRAWFORD/Primary Examiner, Art Unit 2617