Prosecution Insights
Last updated: April 19, 2026
Application No. 18/649,068

GRAPHICS PROCESSING
Non-Final OA: §103, §112

Filed: Apr 29, 2024
Examiner: TRAN, JENNY NGAN
Art Unit: 2615
Tech Center: 2600 (Communications)
Assignee: Arm Limited
OA Round: 1 (Non-Final)

Grant Probability: 20% (At Risk)
Projected OA Rounds: 1-2
Projected Time to Grant: 2y 6m
Grant Probability with Interview: 70%

Examiner Intelligence

Career Allow Rate: 20% (1 granted / 5 resolved; -42.0% vs TC avg)
Interview Lift: +50.0% among resolved cases with interview
Avg Prosecution Timeline: 2y 6m; 31 applications currently pending
Career History: 36 total applications across all art units

Statute-Specific Performance

§101: 8.9% (-31.1% vs TC avg)
§103: 49.0% (+9.0% vs TC avg)
§102: 21.8% (-18.2% vs TC avg)
§112: 18.3% (-21.7% vs TC avg)

TC average is an estimate; based on career data from 5 resolved cases.

Office Action

Rejections: §103 and §112
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Status of the Claims

Claims 1-20 are currently pending in the present application, with claims 1, 7, 11, and 20 being independent.

Information Disclosure Statement

The information disclosure statement (IDS) submitted on 08/07/2024 has been considered by the examiner.

Claim Objections

Claims 1, 4-6, 11, 14-15, and 20 are objected to because they include reference characters which are not enclosed within parentheses. Reference characters corresponding to elements recited in the detailed description of the drawings and used in conjunction with the recitation of the same element or group of elements in the claims should be enclosed within parentheses so as to avoid confusion with other numbers or characters which may appear in the claims. See MPEP § 608.01(m).

Claim Rejections - 35 USC § 112

The following is a quotation of 35 U.S.C. 112(b):

(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:

The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.

Claims 1-6 and 9-20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or, for applications subject to pre-AIA 35 U.S.C. 112, the applicant) regards as the invention. The term “relative” in claims 1, 11, and 20 is a relative term which renders the claim indefinite.
The term “relative” is not defined by the claim, the specification does not provide a standard for ascertaining the requisite degree, and one of ordinary skill in the art would not be reasonably apprised of the scope of the invention. The examiner respectfully requests the applicant to clarify the scope of the claimed invention.

Claims 2, 5, 9, 12, 15, and 18 recite substantially similar subject matter to that of claims 1, 11, and 20 and are rejected using substantially similar rationale to that set forth with respect to claims 1, 11, and 20. Claims depending thereon are also rejected for substantially similar reasons as those set forth for the claims from which they depend.

The recited “ray tracing budget B” and “M groups of threads” in claim 1, and “N threads” in claim 6, are introduced as mathematical variables without any definition of what these variables represent or what type of quantity they are. The claims do not specify whether B, M, and N must be whole numbers, fractions, real numbers, or some other type of value, and do not set forth any explicit relationship or constraints on these variables. Therefore, it is unclear to one of ordinary skill in the art what constitutes an acceptable value for B, M, and N.

Claim 6 further recites “rounding the determined approximate number of rays to be traced for the region to the nearest multiple of M*N, and dividing this rounded value…” It is unclear what is meant by “nearest multiple of M*N” when M and N themselves are not defined values. It is also unclear whether the “rounded value” refers to the approximate number of rays, the product of M*N, or some other quantity. One of ordinary skill in the art cannot determine with reasonable certainty what operations are required by claim 6. The examiner respectfully requests the applicant to clarify the scope of the claimed invention.
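For reference, the arithmetic that the rounding-and-dividing recitation appears to describe can be read as follows under one possible interpretation (a minimal sketch assuming B, M, and N are positive integers; all function and variable names are hypothetical and appear nowhere in the claims):

```cpp
#include <cassert>

// One possible reading of the recited arithmetic: a region's share of the
// global ray budget B is proportional to its relative weight, then rounded
// to the nearest multiple of M*N so it divides evenly among M groups of N
// threads each.

// Approximate rays for a region: relative * B / sum of all relative values.
long approximate_rays(long relative, long budget_B, long sum_of_relatives) {
    return relative * budget_B / sum_of_relatives;
}

// Round a ray count to the nearest multiple of M*N.
long round_to_multiple(long rays, long M, long N) {
    long unit = M * N;
    return (rays + unit / 2) / unit * unit;
}

// Rays traced by each of the M*N threads for its subregion.
long rays_per_thread(long rays, long M, long N) {
    return round_to_multiple(rays, M, N) / (M * N);
}
```

Under this reading, a region weighted 2 out of a total weight of 8 against a budget of 400 rays would receive approximately 100 rays; with M = 4 groups of N = 8 threads, that rounds to 96 rays, or 3 per thread.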
Claims 4, 5, 11, 14, 15, and 20 recite substantially similar subject matter to that of claims 1 and 6 and are rejected using substantially similar rationale to that set forth with respect to claims 1 and 6. Claims depending thereon are also rejected for substantially similar reasons as those set forth for the claims from which they depend.

The recited “comprising repeating the method…” in claim 10 is ambiguous, as it is unclear which method, or which subset of previously recited steps, is being repeated. Claim 10 depends from claim 8, and claim 8 depends from claim 7, where both claims 7 and 8 recite multi-step methods. It is therefore unclear whether claim 10 requires the repetition of all of the steps of claims 7 and 8, or only a subset. The examiner respectfully requests the applicant to clarify the scope of the claimed invention.

Claims 1-6 and 9-20 will be examined as best understood by the examiner.

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.
Patentability shall not be negated by the manner in which the invention was made.

Claims 1-6, 11-17, and 19-20 are rejected under 35 U.S.C. 103 as being unpatentable over Jacobson et al., "Spatial Adaptive Sampling in Real Time Ray Tracing", Department of Computer Science, Lund University, (2021), pages 1-67, hereinafter referred to as “Jacobson”, in view of NVIDIA, “NVIDIA PTX ISA: NVIDIA CUDA Programming Guide”, Version 8.4, published on March 2, 2024, by NVIDIA Corporation, and in further view of The Dude et al. (2017, March 27), Parallel ray tracing in 16x16 chunks, Stack Overflow, https://stackoverflow.com/questions/43056609/parallel-ray-tracing-in-16x16-chunks, hereinafter referred to as “The Dude”.

Regarding claim 1, Jacobson discloses a method of operating a graphics processor to generate a render output made up of a plurality of sampling positions by performing a ray tracing process in which rays are traced through a scene to be rendered (Pg. 21, Section 3.1; The ray tracer, a backwards traced path tracer that supports diffuse, specular, and transmissive materials…also uses importance sampling, defined as sampling of rays that affect the estimation. Section 4; implementation of the ray tracer…pipeline stages of the basic mode, then explains how a single ray is sampled (including how importance sampling is performed), and ends with a more thorough look at the de-noising filter implementation. Pg. 25, Section 3.2.3; For the case of real time ray tracers…video footage was captured to compare the rendered scene…The path was chosen on a per scene basis, with the idea that the path should show various parts of the scene, and expose the ray tracing code for different challenges), wherein the total number of rays to be traced when generating the render output is based on a ray tracing budget B (Pg. 21, Section 3.3.1; default mode of the ray tracer…uses 1 spp. Pg.
30, Section 4.1.2; The budget for a real time tracer is limited to that of ca 1 spp, it will generate a high frequency noisy ray traced image…where each pixel ray bounces seven times and samples the light source…), and wherein different numbers of rays can be traced for different regions of the render output (Pg. 23, Section 3.1.3; adaptive sampling mode…Figure 3.3 shows that we have sampled points A to E in a 9x9 image. From those values, we see that A, B, C are close enough for us to estimate the values in-between points (red) without having to ray trace them. The next step would then be to trace intersecting points…white pixels are not calculated, red pixels are interpolated pixels, and the rest are ray traced. Pg. 34, Section 4.2; sampling every X'th pixel on the vertical and horizontal axis. That is, we divide the image into mxn grids of size XxX), the method comprising: determining relative numbers of rays to be traced for different regions of the render output (Pg. 23, Fig. 3.3 and Pg. 34-36, Section 4.2 and Fig. 4.6; schedule rays…adaptive sampling…each iteration of the schedule-shoot-interpolate passes, we try to interpolate by sampling every X'th pixel on the vertical and horizontal axis. That is, we divide the image into mxn grids of size X x X……Section 4.2.1; To communicate between the ray shooter and the ray scheduler, we work with a texture that says which pixels should be ray traced (queued), and which should not be. That is, for the same size as the application window resolution, we have a texture which operates as a mask…For each time we run the schedule-shoot-interpolate passes, we launch the ray scheduler and the ray tracer program on each pixel), to perform ray tracing for the subregion (Pg. 35, Section 4.2 and Fig. 4.7; Once we have sampled the corners of the grid size Xi, we half the size and use grid size Xi-1. 
In other words, we divide each box of the grid into four new overlapping sub-boxes…The figure starts out with a 9x9 grid, then 5x5 grid, and lastly a 3x3 grid), based on the relative number of rays to be traced for the region of the render output (Pg. 23, Section 3.1.3; adaptive sampling mode…Pg. 34-36, Section 4.2; each iteration of the schedule-shoot-interpolate passes, we try to interpolate by sampling every X'th pixel on the vertical and horizontal axis. That is, we divide the image into mxn grids of size X x X…to communicate between the ray shooter and the ray scheduler, we work with a texture that says which pixels should be ray traced (queued), and which should not be…For each time we run the schedule-shoot-interpolate passes, we launch the ray scheduler and the ray tracer program on each pixel. Fig. 3.3 and Fig. 4.7) and the budget B of rays to be traced when generating the render output (Pg. 21, Section 3.3.1; default mode of the ray tracer…uses 1 spp. Pg. 30, Section 4.1.2; The budget for a real time tracer is limited to that of ca 1 spp, it will generate a high frequency noisy ray traced image…where each pixel ray bounces seven times and samples the light source… Examiner's note: Jacobson 3.1.1 and 4.1.2 describe a fixed real-time sampling budget, while Jacobson 3.1.3 and 4.2 show an adaptive sampler that gives some regions more samples than others, effectively allocating the global budget among regions according to their relative importance) and performing ray tracing for the region (Pg. 27-28, Section 4.1; The core system. Pg. 28-29 and Fig. 4.2; ray tracing system is a backwards traced path tracer with importance sampling, we cast rays from the cameras point of view and let it bounce several times…Pg. 30-31, Section 4.1.2; raw colour values generated per pixel can be seen in figure 4.5, where each pixel ray bounces seven times and samples the light source in the same way illustrated in figure 4.3).
Jacobson does not appear to explicitly disclose allocating M groups of threads to a region of the render output. In the same art of GPU parallel computing, NVIDIA discloses allocating M groups of threads to a region of the render output (Pg. 11, Section 2.2.1; a CTA is an array of threads that execute a kernel concurrently or in parallel…Each CTA thread uses its thread identifier to determine its assigned role, assign specific input and output positions, compute addresses, and select work to perform. The thread identifier is a three-element vector tid (with elements tid.x, tid.y, and tid.z) that specifies the thread's position within a 1D, 2D, or 3D CTA…Examiner's note: CTAs have a 1D/2D/3D shape (ntid.x, ntid.y, ntid.z) and each thread's index selects a position within that array, which can map to a block of pixels or a region of the render output (CTAs are launched over 2D data like images)).

It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to implement the GPU thread-group organization taught by NVIDIA into the ray-tracing pipeline of Jacobson. Jacobson already recites that a large portion of the computation is performed on a massively parallel GPU with multiple cores and threads (Jacobson Pg. 13, Section 2.2), so a person of ordinary skill in the art would look to the NVIDIA PTX programming guide for conventional techniques on how to organize GPU threads in groups (CTAs) over 2D image regions to execute the ray-tracing techniques efficiently. Combining Jacobson's parallel GPU workload into standard cooperative thread arrays yields predictable results in providing efficient thread allocation and improving real-time ray-tracing performance.
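The CTA-to-region mapping described in the Examiner's note can be illustrated with a plain C++ sketch (this is not PTX or CUDA; the names group_x, tid_x, ntid_x, and so on merely mirror the PTX special registers for illustration):

```cpp
#include <cassert>

// Plain C++ illustration of how a thread's position within a group (tid),
// together with the group's position and the group shape (ntid), can map
// to a pixel in a 2D region of the render output.
struct Pixel { int x; int y; };

Pixel pixel_for_thread(int group_x, int group_y,   // which thread group
                       int tid_x, int tid_y,       // position within the group
                       int ntid_x, int ntid_y) {   // threads per group per axis
    return { group_x * ntid_x + tid_x,    // global pixel column
             group_y * ntid_y + tid_y };  // global pixel row
}
```

For example, with 16x16 thread groups, the thread at position (3, 5) of group (2, 1) would cover pixel (35, 21).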
Jacobson in view of NVIDIA does not disclose to perform the ray tracing for the region of the render output, each thread of the M groups of threads being allocated to a subregion of the region, determining the number of rays to be traced by each thread of the M groups of threads when performing ray tracing for the subregion to which they have been allocated, and including each of the threads tracing the determined number of rays for the subregion to which they have been allocated.

In the same art of parallel ray tracing, The Dude discloses to perform the ray tracing for the region of the render output, each thread of the M groups of threads being allocated to a subregion of the region (Response from Adrian McCarthy; you create a Manager object that returns a chunk of work (in the form of a Task) each time a thread calls its GetTask method… std::unique_ptr<Task> Manager::GetTask() { std::lock_guard guard(mutex); std::unique_ptr<Task> t; if (next_row < HEIGHT) { t = std::make_unique<Task>(next_row); ++next_row; } return t; } …the manager creates a new task to ray trace the next row. (You could use 16x16 blocks instead of rows if you like)…Examiner's note: shows a Manager that, each time a worker thread calls GetTask(), returns a new chunk of work (row or 16x16 block) derived from a shared "next_row" counter. This divides the total rows/pixels (and thus rays) across the available threads, so each thread is assigned to a specific number of pixels/rays for its subregion (each worker thread processes a subregion of the image)), determining the number of rays to be traced by each thread of the M groups of threads when performing ray tracing for the subregion to which they have been allocated (Response from Adrian McCarthy; Manager::GetTask()…t = std::make_unique<Task>(next_row);…return t;…the manager creates a new task to ray trace the next row. (You could use 16x16 blocks instead of rows if you like.)
When all the tasks have been issued, it just returns an empty pointer, which essentially tells the calling thread that there's nothing left to do, and the calling thread will then exit), and including each of the threads tracing the determined number of rays for the subregion to which they have been allocated (Response from Adrian McCarthy; void WorkerThread(Manager *manager) { while (auto task = manager->GetTask()) { task->Execute(); } } Examiner's note: each worker thread runs a loop that repeatedly calls the manager to get a Task (row/16x16 block) and executes ray tracing for the particular subregion it has been allocated).

It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to further incorporate the region-assignment technique of The Dude into the combined system of Jacobson and NVIDIA. The Dude addresses how different image regions take different amounts of time to ray trace, so applying this known scheduling technique to Jacobson's ray-tracing threads, organized as groups as taught by NVIDIA, would have been a routine use of familiar load balancing. The motivation lies in the advantages of improved thread utilization, faster rendering, and improved overall CPU/GPU utilization.

Regarding claim 2, Jacobson in view of NVIDIA and in further view of The Dude discloses the method of claim 1, and further discloses wherein the relative number of rays to be traced for different regions of the render output is determined based on data indicating the presence of sampling positions in one or more different regions of the render output that could particularly benefit from receiving more ray tracing samples (Jacobson Pg.
23, Section 3.1.3; the adaptive sampling mode aims to lower the average sampling per pixel count by skipping ray tracing some pixels…From those values, we see that A, B, and C are close enough for us to estimate the values in-between these points (red) without having to ray trace them. Pg. 36-37, Section 4.2.2; When we try to interpolate pixels (that is, estimating new data points from already known data) …we demand that certain parameters need to be similar enough. If they are not similar enough, we schedule new rays…Examiner's note: uses per-pixel data (Section 4.1.2-4.1.4; ray tracing geometry buffer and history buffer) to decide which pixels can be interpolated and which must be ray traced again). The motivation to combine would have been the same as that set forth above with respect to claim 1.

Regarding claim 3, Jacobson in view of NVIDIA and in further view of The Dude discloses the method of claim 2, and further discloses wherein the data indicating the presence of sampling positions in one or more different regions of the render output that could particularly benefit from receiving more ray tracing samples (Jacobson Pg. 23, Section 3.1.3; the adaptive sampling mode aims to lower the average sampling per pixel count by skipping ray tracing some pixels…From those values, we see that A, B, and C are close enough for us to estimate the values in-between these points (red) without having to ray trace them. Pg. 36-37, Section 4.2.2; When we try to interpolate pixels (that is, estimating new data points from already known data) …we demand that certain parameters need to be similar enough. If they are not similar enough, we schedule new rays…Examiner's note: uses per-pixel data (Section 4.1.2-4.1.4; ray tracing geometry buffer and history buffer) to decide which pixels can be interpolated and which must be ray traced again) comprises one or more of: data indicating an area of disocclusion for the region (Jacobson Pg.
33, Section 4.1.4; We now need to check if we actually see the same thing from the previous point of view and the current point of view. For each pixel, we compare the history buffer and ray tracing buffers' world positions, world normals, and their object IDs…If all of these criteria are met, we have a successful re-projection…If the re-projection was not successful, we set the history length of that pixel to one. Examiner's note: pixels failing re-projection tests are disoccluded or otherwise newly visible (the implementation flags them by resetting history)), data indicating an area of specular highlights for the region; data indicating an area of high spatiotemporal variance and/or soft shadows (Jacobson Pg. 21, Section 3.1; The ray tracer, a backwards traced path tracer that supports diffuse, specular, and transmissive materials…Pg. 33, Section 4.1.4; use the variance to process the colour representation of the pixels, we use the RGB luminance representation as the dataset to calculate the variance on…these two values are what we define as the moments of the pixels…due to the rough and noisy data that is the raw ray tracing input, we first estimate the moments spatially, and when the said history length is long enough, we calculate the variance temporally), and data from a learned algorithm or neural network, optionally feedback data from a denoiser (Jacobson Pg. 32, Section 4.1.4; Spatio-Temporal Variance Guided Filtering denoising filter… (σz, σl, σn) and a projection limit θp. These can be thought of as the parameters for the filtering process). The motivation to combine would have been the same as that set forth above with respect to claim 1.

Regarding claim 4, Jacobson in view of NVIDIA and in further view of The Dude discloses the method of claim 1, and Jacobson further teaches a determination based on a determined number of rays that are to be traced for the region (Jacobson Section 3.1.3 and Pg.
22, Section 3.1.3; Figure 3.3 shows that we have sampled points A to E in a 9x9 image. From those values, we see that A, B, and C are close enough for us to estimate the values in-between these points…). Jacobson does not disclose wherein the number M of groups of threads allocated to a region of the render output. In the same art of parallel GPU computing, NVIDIA discloses wherein the number M of groups of threads allocated to a region of the render output (NVIDIA Pg. 11, Section 2.2.1; a CTA is an array of threads that execute a kernel concurrently or in parallel…Each CTA thread uses its thread identifier to determine its assigned role, assign specific input and output positions, compute addresses, and select work to perform. The thread identifier is a three-element vector tid (with elements tid.x, tid.y, and tid.z) that specifies the thread's position within a 1D, 2D, or 3D CTA…Examiner's note: CTAs have a 1D/2D/3D shape (ntid.x, ntid.y, ntid.z) and each thread's index selects a position within that array, which can map to a block of pixels or a region of the render output (CTAs are launched over 2D data like images)). The motivation to combine would have been the same as that set forth above with respect to claim 1.

Regarding claim 5, Jacobson in view of NVIDIA and in further view of The Dude discloses the method of claim 1, and further discloses wherein determining the number of rays to be traced by each thread of the M groups of threads when performing ray tracing for the subregion to which they have been allocated comprises: determining an approximate number of rays to be traced for the region of the render output by multiplying the relative number of rays to be traced for the region by the ray tracing budget B, divided by the sum of all the relative numbers of rays to be traced for all of the regions of the render output (Jacobson Pg. 21, Section 3.3.1; default mode of the ray tracer…uses 1 spp. Pg.
30, Section 4.1.2; The budget for a real time tracer is limited to that of ca 1 spp, it will generate a high frequency noisy ray traced image…where each pixel ray bounces seven times and samples the light source…Pg. 23, Section 3.1.3; the adaptive sampling mode aims to lower the average sampling per pixel count by skipping ray tracing some pixels. Examiner's note: states a fixed sampling budget, then uses the adaptive sampling rules to give some regions more sampling and others none, showing importance sampling and distributing the global budget across regions in correspondence to that importance). The motivation to combine would have been the same as that set forth above with respect to claim 1.

Regarding claim 6, Jacobson in view of NVIDIA and in further view of The Dude discloses the method of claim 5. Jacobson does not disclose wherein each of the M groups of threads comprises N threads. In the same art of parallel GPU computing, NVIDIA discloses wherein each of the M groups of threads comprises N threads (NVIDIA Pg. 11, Section 2.2.1; The vector ntid specifies the number of threads in each CTA dimension). The motivation to combine would have been the same as that set forth above with respect to claim 1. Jacobson in view of NVIDIA does not disclose that the method further comprises rounding the determined approximate number of rays to be traced for the region to a nearest multiple of M*N and dividing this rounded value by M*N to give the number of rays to be traced by each of the M*N threads when performing ray tracing for the subregion to which they have been allocated.
In the same art of parallel ray tracing, The Dude discloses that the method further comprises rounding the determined approximate number of rays to be traced for the region to a nearest multiple of M*N (The Dude's code snippet; auto size = WIDTH*HEIGHT;), and dividing this rounded value by M*N (The Dude's code snippet; auto chunk = size / nThreads;) to give the number of rays to be traced by each of the M*N threads when performing ray tracing for the subregion to which they have been allocated (The Dude's post; dividing the image into as many chunks as the system has and rendering them parallel). It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to combine Jacobson's per-region ray tracing budgets with the GPU execution model as taught by NVIDIA and the per-thread division of work, rounding each region's ray count to a multiple of the total thread count M*N and then dividing that amount evenly among the threads, as taught by The Dude. Matching work counts to warp/group sizes and balancing per-thread workload is a standard optimization that avoids idle threads, maximizes occupancy, and overall improves throughput.

Regarding claim 11, claim 11 is the system claim (Jacobson Section 2.2 Technical Details) of method claim 1 and is accordingly rejected using substantially similar rationale to that set forth with respect to claim 1.

Regarding claim 12, claim 12 has similar limitations to those of claim 2, except that it is a system claim (Jacobson Section 2.2 Technical Details); therefore it is rejected under the same rationale as claim 2.

Regarding claim 13, claim 13 has similar limitations to those of claim 3, except that it is a system claim (Jacobson Section 2.2 Technical Details); therefore it is rejected under the same rationale as claim 3.
Regarding claim 14, claim 14 has similar limitations to those of claim 4, except that it is a system claim (Jacobson Section 2.2 Technical Details); therefore it is rejected under the same rationale as claim 4.

Regarding claim 15, claim 15 has similar limitations to those of claim 5, except that it is a system claim (Jacobson Section 2.2 Technical Details); therefore it is rejected under the same rationale as claim 5.

Regarding claim 16, Jacobson in view of NVIDIA and in further view of The Dude discloses the graphics processor of claim 11, and Jacobson further discloses wherein each of the subregions of the region comprises a plurality of sampling positions (Pg. 35, Section 4.2 and Fig. 4.7; Once we have sampled the corners of the grid size Xi, we half the size and use grid size Xi-1. In other words, we divide each box of the grid into four new overlapping sub-boxes…The figure starts out with a 9x9 grid, then 5x5 grid, and lastly a 3x3 grid). Jacobson in view of NVIDIA does not disclose that each thread traces the determined number of rays for the subregion by cycling over sampling positions of its allocated subregion in turn to trace one or more rays for one or more of the sampling positions of the subregion. In the same art of parallel ray tracing, The Dude discloses that each thread traces the determined number of rays for the subregion by cycling over sampling positions of its allocated subregion in turn to trace one or more rays for one or more of the sampling positions of the subregion (The Dude, Response from Adrian McCarthy; void WorkerThread(Manager *manager) { while (auto task = manager->GetTask()) { task->Execute(); } } Examiner's note: each worker thread runs a loop that repeatedly calls the manager to get a Task (row/16x16 block) and executes ray tracing for the particular subregion it has been allocated).
It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to implement The Dude's multi-threaded work-distribution technique into Jacobson's and NVIDIA's combined system. Doing so allows Jacobson's scene-dependent ray tracing to be dynamically fed to worker threads, yielding predictable results in improving parallel CPU/GPU utilization, balancing the load among threads, and reducing render time.

Regarding claim 17, Jacobson in view of NVIDIA and in further view of The Dude discloses the graphics processor of claim 16, and Jacobson further discloses a number of traced rays that is not a multiple of the number of sampling positions that each subregion comprises (Pg. 35, Section 4.2.1; Figure 4.7 shows which pixels are ray traced for the adaptive sampler with initial grid size of 9x9 at different stages. Orange pixels are ray traced in the first step, blue in the second step, and green in the third. Red pixels shows the pixel we are trying to calculate. Red box shows which area the red pixel looks at in the different stages (but does not use the white pixels). Examiner's note: within a 9x9 subregion, only some pixels are traced in each pass, and the rest are white/interpolated. When the adaptive sampling technique is executed by a thread for its subregion, the total number of rays that thread traces for that subregion equals the number of traced pixels, not all sampling positions. For example, tracing 5 pixels in a 9x9 subregion yields 5 rays, which is not a multiple of 81); and that each thread traces a different number of rays for one or more sampling positions of the subregion compared to other sampling positions of the subregion, based on the order that the thread cycles over the sampling positions (Pg. 35, Section 4.2 and Fig. 4.7; Once we have sampled the corners of the grid size Xi, we half the size and use grid size Xi-1.
In other words, we divide each box of the grid into four new overlapping sub-boxes… Figure 4.7 shows which pixels are ray traced for the adaptive sampler with initial grid size of 9x9 at different stages. Orange pixels are ray traced in the first step, blue in the second step, and green in the third. Red pixels shows the pixel we are trying to calculate. Red box shows which area the red pixel looks at in the different stages (but does not use the white pixels)). Jacobson in view of NVIDIA does not disclose wherein the number of rays to be traced by each of the threads for the subregion to which they have been allocated. In the same art of parallel ray tracing, The Dude discloses wherein the number of rays to be traced by each of the threads for the subregion to which they have been allocated (The Dude, Response from Adrian McCarthy; the manager creates a new task to ray trace the next row. (You could use 16x16 blocks instead of rows if you like.)). The motivation to combine would have been the same as that set forth above with respect to claim 1.

Regarding claim 19, Jacobson in view of NVIDIA and in further view of The Dude discloses the graphics processor of claim 11, and Jacobson further discloses wherein the graphics processor is configured to (Jacobson Section 2.2 Technical Details), when successively generating one or more plural render outputs having corresponding regions (Pg. 25, Section 3.2.3; For the case of real time ray tracers…video footage was captured to compare the rendered scene…The path was chosen on a per scene basis, with the idea that the path should show various parts of the scene, and expose the ray tracing code for different challenges. Pg. 25, Section 3.2.4; The same paths for when the camera was moving was used in all measurements, that is, the path used for frame generation measurement is the same path used when capturing video footage of the rendered scene.
Examiner's note: renders the same scenes along fixed camera paths and uses history buffers to align pixels frame-to-frame, corresponding regions and subregions across successive frames), each corresponding region comprising a corresponding set of subregions (Pg. 35, Section 4.2 and Fig. 4.7; Once we have sampled the corners of the grid size Xi, we half the size and use grid size Xi-1. In other words, we divide each box of the grid into four new overlapping sub-boxes…The figure starts out with a 9x9 grid, then 5x5 grid, and lastly a 3x3 grid), and when generating successive render outputs (Pg. 25, Section 3.2.3; For the case of real time ray tracers…video footage was captured to compare the rendered scene…The path was chosen on a per scene basis, with the idea that the path should show various parts of the scene, and expose the ray tracing code for different challenges. Pg. 25, Section 3.2.4; The same paths for when the camera was moving was used in all measurements, that is, the path used for frame generation measurement is the same path used when capturing video footage of the rendered scene. Examiner's note: renders the same scenes along fixed camera paths and uses history buffers to align pixels frame-to-frame, corresponding regions and subregions across successive frames). Jacobson in view of NVIDIA does not disclose each corresponding subregion being allocated to a same thread, and cycle the sampling position at which each thread starts cycling over the sampling positions of each corresponding subregion to which it is allocated. In the same art of parallel ray tracing, The Dude discloses each corresponding subregion being allocated to a same thread (Response from Adrian McCarthy; t = std::make_unique<Task>(next_row); ++next_row; … the manager creates a new task to ray trace the next row. (You could use 16x16 blocks instead of rows if you like.
Examiner’s note: worker thread based on per row or per block), and cycle the sampling position at which each thread starts cycling over the sampling positions of each corresponding subregion to which it is allocated (The Dude Response from Adrian McCarthy; void WorkerThread(Manager *manager) { while (auto task = manager->GetTask()) { task->Execute(); } } …the manager creates a new task to ray trace the next row. (You could use 16x16 blocks instead of rows if you like. Examiner's note: each worker thread runs a loop that repeatedly calls manager to get a Task (row/16x16 block) and executes ray tracing for that particular subregion it has been allocated). It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to implement The Dude’s multi-threaded work-distribution technique into Jacobson’s and NVIDIA’s combined system. Doing so allows uniform sampling across frames, improving the uniformity and consistency of the sequence of render outputs without increasing ray-tracing workload. Regarding claim 20, claim 20 is the CRM claim (Jacobson Section 2.2 Technical Details) of method claim 1 and is accordingly rejected using substantially similar rationale to that set forth with respect to claim 1. Claim(s) 7-8, and 10 is/are rejected under 35 U.S.C. 103 as being unpatentable over Jacobson et al., "Spatial Adaptive Sampling in Real Time Ray Tracing", Department of Computer Science, Lund University, (2021), pages 1-67, hereinafter referred to as “Jacobson”, in view of The Dude et al. (2017, March 27). Parallel ray tracing in 16x16 chunks. Stack Overflow. https://stackoverflow.com/questions/43056609/parallel-ray-tracing-in-16x16-chunks, hereinafter referred to as “The Dude”.
Regarding claim 7, Jacobson discloses a method of operating a graphics processor to generate a render output made up of a plurality of sampling positions by performing a ray tracing process in which rays are traced through a scene to be rendered (Pg. 21, Section 3.1; The ray tracer, a backwards traced path tracer that supports diffuse, specular, and transmissive materials…also uses importance sampling, defined as sampling of rays that affect the estimation. Section 4; implementation of the ray tracer…pipeline stages of the basic mode, then explains how a single ray is sampled (including how importance sampling is performed), and ends with a more thorough look at the de-noising filter implementation. Pg. 25, Section 3.2.3; For the case of real time ray tracers…video footage was captured to compare the rendered scene…The path was chosen on a per scene basis, with the idea that the path should show various parts of the scene, and expose the ray tracing code for different challenges), the method comprising: each of the subregions of the region comprising a plurality of sampling positions (Pg. 35, Section 4.2 and Fig. 4.7; Once we have sampled the corners of the grid size Xi, we half the size and use grid size Xi-1. In other words, we divide each box of the grid into four new overlapping sub-boxes…The figure starts out with a 9x9 grid, then 5x5 grid, and lastly a 3x3 grid), and performing ray tracing for the region (Pg. 27-28, Section 4.1; The core system. Pg. 28-29 and Fig. 4.2; ray tracing system is a backwards traced path tracer with importance sampling, we cast rays from the cameras point of view and let it bounce several times…Pg. 30-31, Section 4.1.2; raw colour values generated per pixel can be seen in figure 4.5, where each pixel ray bounces seven times and samples the light source in the same way illustrated in figure 4.3). 
Jacobson fails to disclose allocating a plurality of threads to a region of the render output, each thread being allocated to a subregion of the region to perform ray tracing for the subregion, and including each of the threads tracing rays for its allocated subregion by cycling over sampling positions of its allocated subregion in turn to trace one or more rays for one or more of the sampling positions of the subregion. In the same art of parallel ray tracing, The Dude discloses allocating a plurality of threads to a region of the render output, each thread being allocated to a subregion of the region to perform ray tracing for the subregion (Response from Adrian McCarthy; you create a Manager object that returns a chunk of work (in the form of a Task) each time a thread calls its GetTask method… std::unique_ptr<Task> Manager::GetTask() { std::lock_guard guard(mutex); std::unique_ptr<Task> t; if (next_row < HEIGHT) { t = std::make_unique<Task>(next_row); ++next_row; } return t; } the manager creates a new task to ray trace the next row. (You could use 16x16 blocks instead of rows if you like)…Examiner's note: shows a Manager that, each time a worker thread calls GetTask(), returns a new chunk of work (row or 16x16 block) derived from a shared "next_row" counter.
This divides the total rows/pixels (and thus rays) across the available threads, so each thread is assigned to a specific number of pixels/rays for its subregion (each worker thread processes a subregion of the image)), including each of the threads tracing rays for its allocated subregion by cycling over sampling positions of its allocated subregion in turn to trace one or more rays for one or more of the sampling positions of the subregion (Response from Adrian McCarthy; void WorkerThread(Manager *manager) { while (auto task = manager->GetTask()) { task->Execute(); } } Examiner's note: each worker thread runs a loop that repeatedly calls manager to get a Task (row/16x16 block) and executes ray tracing for that particular subregion it has been allocated). It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to implement The Dude’s multi-threaded work-distribution technique into Jacobson’s ray tracing system. Doing so allows Jacobson’s scene-dependent ray-tracing to be dynamically fed to worker threads, yielding the predictable results of improving parallel CPU/GPU utilization, balancing the load among threads, and reducing render time. Regarding claim 8, Jacobson in view of The Dude discloses the method of claim 7, and Jacobson further discloses is not a multiple of the number of sampling positions that each subregion comprises (Pg. 35, Section 4.2.1; Figure 4.7 shows which pixels are ray traced for the adaptive sampler with initial grid size of 9x9 at different stages. Orange pixels are ray traced in the first step, blue in the second step, and green in the third. Red pixels shows the pixel we are trying to calculate. Red box shows which area the red pixel looks at in the different stages (but does not use the white pixels). Examiner's note: within a 9x9 subregion, only some pixels are traced in each pass, and the rest are white/interpolated.
When the adaptive sampling technique is executed by a thread for its subregion, the total number of rays that thread traces for that subregion equals the number of traced pixels, not all sampling positions. For example, tracing 5 pixels in a 9x9 subregion yields 5 rays, which is not a multiple of 81); and each thread tracing a different number of rays for one or more sampling positions of the subregion compared to other sampling positions of the subregion, based on the order that the thread cycles over the sampling positions (Pg. 35, Section 4.2 and Fig. 4.7; Once we have sampled the corners of the grid size Xi, we half the size and use grid size Xi-1. In other words, we divide each box of the grid into four new overlapping sub-boxes… Figure 4.7 shows which pixels are ray traced for the adaptive sampler with initial grid size of 9x9 at different stages. Orange pixels are ray traced in the first step, blue in the second step, and green in the third. Red pixels shows the pixel we are trying to calculate. Red box shows which area the red pixel looks at in the different stages (but does not use the white pixels)). Jacobson does not disclose wherein the number of rays to be traced by each of the threads for the subregion to which they have been allocated. In the same art of parallel ray tracing, The Dude discloses wherein the number of rays to be traced by each of the threads for the subregion to which they have been allocated (The Dude Response from Adrian McCarthy; the manager creates a new task to ray trace the next row. (You could use 16x16 blocks instead of rows if you like.)). The motivation to combine would have been the same as that set forth above with respect to claim 1.
Regarding claim 10, Jacobson in view of The Dude discloses the method of claim 8, and Jacobson further discloses wherein the graphics processor is configured to (Jacobson Section 2.2 Technical Details), repeating the method to successively generate one or more further render outputs having corresponding regions (Pg. 25, Section 3.2.3; For the case of real time ray tracers…video footage was captured to compare the rendered scene…The path was chosen on a per scene basis, with the idea that the path should show various parts of the scene, and expose the ray tracing code for different challenges. Pg. 25, Section 3.2.4; The same paths for when the camera was moving was used in all measurements, that is, the path used for frame generation measurement is the same path used when capturing video footage of the rendered scene. Examiner's note: renders the same scenes along fixed camera paths and uses history buffers to align pixels frame-to-frame, corresponding regions and subregions across successive frames), each corresponding region comprising a corresponding set of subregions (Pg. 35, Section 4.2 and Fig. 4.7; Once we have sampled the corners of the grid size Xi, we half the size and use grid size Xi-1. In other words, we divide each box of the grid into four new overlapping sub-boxes…The figure starts out with a 9x9 grid, then 5x5 grid, and lastly a 3x3 grid), and when generating successive render outputs (Pg. 25, Section 3.2.3; For the case of real time ray tracers…video footage was captured to compare the rendered scene…The path was chosen on a per scene basis, with the idea that the path should show various parts of the scene, and expose the ray tracing code for different challenges. Pg. 25, Section 3.2.4; The same paths for when the camera was moving was used in all measurements, that is, the path used for frame generation measurement is the same path used when capturing video footage of the rendered scene.
Examiner's note: renders the same scenes along fixed camera paths and uses history buffers to align pixels frame-to-frame, corresponding regions and subregions across successive frames). Jacobson does not disclose each corresponding subregion being allocated to a same thread, and each thread cycling the sampling position at which each thread starts cycling over the sampling positions of each corresponding subregion to which it is allocated. In the same art of parallel ray tracing, The Dude discloses each corresponding subregion being allocated to a same thread (Response from Adrian McCarthy; t = std::make_unique<Task>(next_row); ++next_row; … the manager creates a new task to ray trace the next row. (You could use 16x16 blocks instead of rows if you like. Examiner’s note: worker thread based on per row or per block), and each thread cycling the sampling position at which each thread starts cycling over the sampling positions of each corresponding subregion to which it is allocated (The Dude Response from Adrian McCarthy; void WorkerThread(Manager *manager) { while (auto task = manager->GetTask()) { task->Execute(); } } …the manager creates a new task to ray trace the next row. (You could use 16x16 blocks instead of rows if you like. Examiner's note: each worker thread runs a loop that repeatedly calls manager to get a Task (row/16x16 block) and executes ray tracing for that particular subregion it has been allocated). It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to implement The Dude’s multi-threaded work-distribution technique into Jacobson’s ray tracing system. Doing so allows uniform sampling across frames, improving the uniformity and consistency of the sequence of render outputs without increasing ray-tracing workload. Claim(s) 9 is/are rejected under 35 U.S.C.
103 as being unpatentable over Jacobson et al., "Spatial Adaptive Sampling in Real Time Ray Tracing", Department of Computer Science, Lund University, (2021), pages 1-67, hereinafter referred to as “Jacobson”, in view of The Dude et al. (2017, March 27). Parallel ray tracing in 16x16 chunks. Stack Overflow. https://stackoverflow.com/questions/43056609/parallel-ray-tracing-in-16x16-chunks, hereinafter referred to as “The Dude”, and in further view of Shirley et al. Ray tracing in one weekend, December 2020. https://raytracing.github.io/books/RayTracingInOneWeekend.html [Online; Sourced 2021-07-12], hereinafter referred to as “Shirley”. Regarding claim 9, Jacobson in view of The Dude discloses the method of claim 8, and further discloses comprising each thread starting the cycling over the sampling positions of its allocated subregion (The Dude Response from Adrian McCarthy; void WorkerThread(Manager *manager) { while (auto task = manager->GetTask()) { task->Execute(); } } …the manager creates a new task to ray trace the next row. (You could use 16x16 blocks instead of rows if you like. Examiner's note: each worker thread runs a loop that repeatedly calls manager to get a Task (row/16x16 block) and executes ray tracing for that particular subregion it has been allocated). Jacobson and The Dude are combined for the reason set forth above with respect to claim 7. Jacobson in view of The Dude does not disclose at a random sampling position of the subregion relative to the sampling position at which each other thread starts its cycle over the sampling positions of its allocated subregion.
In the same art of ray tracing, Shirley discloses at a random sampling position of the subregion relative to the sampling position at which each other thread starts its cycle over the sampling positions of its allocated subregion (Section 8.2; pixel_sample_square() that generates a random sample point within the unit square centered at the origin.… vec3 pixel_sample_square() const { // Returns a random point in the square surrounding a pixel at the origin. auto px = -0.5 + random_double(); auto py = -0.5 + random_double(); return (px * pixel_delta_u) + (py * pixel_delta_v); }). It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the multi-threaded subregion processing of Jacobson in view of The Dude with Shirley’s random sampling positions. Randomized sampling, or Monte Carlo sampling, is a known technique in adaptive sampling, and yields the predictable results of effective anti-aliasing, reduced bias, and a more realistic output. Claim(s) 18 is/are rejected under 35 U.S.C. 103 as being unpatentable over Jacobson et al., "Spatial Adaptive Sampling in Real Time Ray Tracing", Department of Computer Science, Lund University, (2021), pages 1-67, hereinafter referred to as “Jacobson”, in view of NVIDIA “NVIDIA PTX ISA: NVIDIA CUDA Programming Guide”, Version 8.4 published on March 2, 2024, by NVIDIA Corporation, in further view of The Dude et al. (2017, March 27). Parallel ray tracing in 16x16 chunks. Stack Overflow. https://stackoverflow.com/questions/43056609/parallel-ray-tracing-in-16x16-chunks, hereinafter referred to as “The Dude”, and in further view of Shirley et al. Ray tracing in one weekend, December 2020. https://raytracing.github.io/books/RayTracingInOneWeekend.html [Online; Sourced 2021-07-12], hereinafter referred to as “Shirley”.
Regarding claim 18, Jacobson in view of NVIDIA and in further view of The Dude discloses the graphics processor of claim 17, and further discloses wherein each thread starts the cycling over the sampling positions of its allocated subregion (The Dude Response from Adrian McCarthy; void WorkerThread(Manager *manager) { while (a

Prosecution Timeline

Apr 29, 2024
Application Filed
Nov 24, 2025
Non-Final Rejection — §103, §112 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12499589
SYSTEMS AND METHODS FOR IMAGE GENERATION VIA DIFFUSION
2y 5m to grant Granted Dec 16, 2025
Study what changed to get past this examiner. Based on the 1 most recent grant.

Prosecution Projections

1-2
Expected OA Rounds
20%
Grant Probability
70%
With Interview (+50.0%)
2y 6m
Median Time to Grant
Low
PTA Risk
Based on 5 resolved cases by this examiner. Grant probability derived from career allow rate.
