Last updated: May 29, 2026

Application No. 17/957,508

Selecting a Tiling Scheme for Processing Instances of Input Data Through a Neural Netwok

Final Rejection §102§103

Filed

Sep 30, 2022

Examiner

BOSTWICK, SIDNEY VINCENT

Art Unit

2124

Tech Center

2100 — Computer Architecture & Software

Assignee

Ati Technologies Ulc

OA Round

2 (Final)

Interview Optional

— +38.0% interview lift. Interview already conducted in this application's prosecution history. This examiner has a 51% grant rate with +38.0% interview lift. Since an interview has already been tried, recommend written response with narrowed claims based on precedent claim evolution patterns.

Based on 138 resolved cases, 2023–2026

Examiner Intelligence

BOSTWICK, SIDNEY VINCENT View full profile →

Grants 51% of resolved cases

Career Allowance Rate

71 granted / 138 resolved

-3.6% vs TC avg

Strong +38% interview lift

Without

With

+38.0%

Interview Lift

resolved cases with interview

Typical timeline

4y 5m

Avg Prosecution

45 currently pending

Career history

207

Total Applications

across all art units

Statute-Specific Performance

§101

2.5%

-37.5% vs TC avg

§103

93.4%

+53.4% vs TC avg

§102

1.4%

-38.6% vs TC avg

§112

2.6%

-37.4% vs TC avg

Black line = Tech Center average estimate • Based on career data from 138 resolved cases

Office Action

§102 §103

Notice of Pre-AIA  or AIA  Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA .

Remarks
This Office Action is responsive to Applicants' Amendment filed on November 5, 2025, in which claims 1, 4, 13, and 16 are currently amended. Claims 1-23 are currently pending.

Information Disclosure Statement
The information disclosure statements (IDS) submitted on October 17, 2025 and October 29, 2025 are in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statement is being considered by the examiner.

Response to Arguments
The rejections to claims 5, 6, 8, 17, 18, and 20 under 35 U.S.C. § 112(b) are hereby withdrawn, as necessitated by applicant's amendments and remarks made to the rejections.

Applicant’s arguments with respect to rejection of claims 1-23 under 35 U.S.C. 101 based on amendment have been considered and are persuasive. 


Applicant’s arguments with respect to rejection of claims 1-23 under 35 U.S.C. 102/103 based on amendment have been considered, however, are not persuasive.
Applicant admits on p. 10 of the Remarks submitted 11/5/2025 that Guan "analyzes model description, and extracts topological structure and operations of the target model" (neural-network characteristics) and "allocates hardware resources reasonably, due to the limitation on computation resource and DRAM storage of modern FPGAs" (processing-circuitry properties).  Examiner notes that Guan does this using several tiling schemes (Im2col, row-major, and channel-major) using different rules (described in detail in §IVC of Guan) for relating the neural-network characteristics with processing-circuitry properties, and selects the channel-major tiling scheme to process instances of input data.
With respect to Applicant's arguments on p. 10 of the Remarks submitted 11/5/2025 that "Nowhere does Guan disclose or suggest a plurality of distinct tiling schemes (e.g. line-buffer, patch, or layer-based tiling)" Examiner notes that the instant claims do not require line-buffer, patch, or layer-based tiling, nor would it be reasonable to limit the scope of "tiling schemes" so narrowly in view of the instant claims.
	For at least these reasons Examiner respectfully asserts that the rejection in view of Guan is very reasonable and should be maintained.  
For the same reasons, the rejections in view of the combination of Guan and Ghasemi are reasonable and should be maintained.

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA  to pre-AIA ) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.

	Claims 1-4, 6-16, and 19-23 are rejected under U.S.C. §103 as being unpatentable over the combination of Guan ("FP-DNN: An Automated Framework for Mapping Deep Neural Networks onto FPGAs with RTL-HLS Hybrid Templates", 2017).  

	 Regarding claim 1, Guan teaches An electronic device, comprising: processing circuitry configured to: acquire information about a neural network and properties of the processing circuitry;([p. 154 §III] "The Model Mapper uses the model description to extract information about model structure and configurations of each layer. Although storing model parameters and intermediate results in on-chip BRAM can significantly improve performance, we do not have enough on-chip BRAMs on a single FPGA to store all of them for modern DNNs. As a result, we have to allocate data buffers in off-chip DDR memory for storing intermediate activations and model parameters. Then, Model Mapper generates an Execution Graph shown in Figure 2c, which shows ideally how the model inference is performed on hardware" See also FIG. 1)
	process instances of input data in the neural network using the given tiling scheme ([p. 155 §IV] "MM takes in two tiles of input matrices, and performs the tiled multiplication vector by vector" [p. 152] "We implement accelerators with RTL-HLS hybrid templates, and convert model inference into general-purpose computations like matrix multiplication")
	the given tiling scheme having been selected from among a set of tiling schemes using a set of tiling-scheme rules that relate combinations of neural-network characteristics and processing-circuitry properties to respective tiling schemes([p. 155 §IIIC] "HW Generator is in fact a library of RTL-HLS hybrid templates for various types of layers. We use RTL-HLS hybrid templates instead of pure RTL templates or pure HLS templates [...] With the kernel configuration generated by Model Mapper, HW Generator instantiates the corresponding optimized kernel template to generate the hardware codes for Device"  Template library interpreted as synonymous with a set of tiling schemes.  See also FIG. 4 which shows an alternative set of tiling schemes of which the author explicitly selects FIG. 4c for convolutional layers.  FIG. 4c being the channel-major rule relating combinations of neural network characteristics (feature maps) to processing circuitry DRAM layout.).

	 Regarding claim 2, Guan teaches The electronic device of claim 1, wherein each tiling scheme of the set of tiling schemes is associated with a different arrangement of portions into which instances of input data are divided for processing in the neural network.(Guan [p. 156] "For communication optimizations in convolutional layers, we use Figure 4 as a simplified example to illustrate the problems and our solutions [...] In Figure 4, we show three different layout schemes for comparison: Im2col, Row-major and Channel-major. For each scheme, we show its DRAM layout and DRAM accessing pattern for the first tile" [p. 157 §IVC] "Channel-major: Different from Row-major, Channel-major stores Input Features in a channel-major manner. Thus, Input Matrix needs to be reorganized correspondingly: each row
(input patch) is also flattened in a channel-major manner. The contents of reorganized first tile is also shown in Figure 4c. So there are in total 72 elements stored in DRAM for Input Features without any data duplication. And DRAM is also accessed continuously for fetching input elements. Furthermore, in this scheme, the outputs are also generated in a channel-major manner, which indicates no extra operations for data reorganizing or duplication are needed." See FIG. 4).
	
	 Regarding claim 3, Guan teaches The electronic device of claim 2, wherein, when processing the instances of input data using the given tiling scheme, the processing circuitry configured to: for each instance of input data: divide the instance of input data into a plurality of portions based at least in part on the arrangement of portions associated with the given tiling scheme;(Guan [p. 156] "For communication optimizations in convolutional layers, we use Figure 4 as a simplified example to illustrate the problems and our solutions [...] In Figure 4, we show three different layout schemes for comparison: Im2col, Row-major and Channel-major. For each scheme, we show its DRAM layout and DRAM accessing pattern for the first tile" [p. 157 §IVC] "Channel-major: Different from Row-major, Channel-major stores Input Features in a channel-major manner. Thus, Input Matrix needs to be reorganized correspondingly: each row (input patch) is also flattened in a channel-major manner. The contents of reorganized first tile is also shown in Figure 4c. So there are in total 72 elements stored in DRAM for Input Features without any data duplication. And DRAM is also accessed continuously for fetching input elements. Furthermore, in this scheme, the outputs are also generated in a channel-major manner, which indicates no extra operations for data reorganizing or duplication are needed." See FIG. 4)
	process each of one or more portions to generate a respective output for the one or more portions; and (Guan [p. 157 §IVC] "So there are in total 72 elements stored in DRAM for Input Features without any data duplication. And DRAM is also accessed continuously for fetching input elements. Furthermore, in this scheme, the outputs are also generated in a channel-major manner")
	combine the respective outputs to generate an output for the instance of input data.(Guan [p. 155 §IVA] "MM takes in two tiles of input matrices, and performs the tiled multiplication vector by vector. All the input data are fed into multipliers simultaneously, then the intermediate results are summed up").
	
	 Regarding claim 4, Guan teaches The electronic device of claim 2, wherein the set of tiling schemes includes two or more of: a line buffer processing tiling scheme; a patch processing tiling scheme; and(Guan [p. 156 §IV] "Firstly, we need to turn the input features from a 3-D array into a 2-D array that we can calculate as a matrix. To get a single feature in an output channel, we need to convolve a 3-D cube of input features (also known as a patch) with the corresponding convolution kernels. So we take each one of these input patches and flatten them into a single row of input matrix")
	a layer processing tiling scheme.(Guan [p. 154 §IIB] "W Generator is in fact a library of RTL-HLS hybrid templates for various types of layers." [p. 155 §IV] "we divide the operations involved in each layer into computation-intensive part and layer-specific part").
	
	 Regarding claim 6, Guan teaches The electronic device of claim 4, wherein, for the patch processing tiling scheme: the portions of the instances of input data are patches from among a plurality of patches in the instances of input data; and patch processing is used for processing patches from the instances of input data.(Guan [p. 156 §IV] "Firstly, we need to turn the input features from a 3-D array into a 2-D array that we can calculate as a matrix. To get a single feature in an output channel, we need to convolve a 3-D cube of input features (also known as a patch) with the corresponding convolution kernels. So we take each one of these input patches and flatten them into a single row of input matrix").
	
	 Regarding claim 7, Guan teaches The electronic device of claim 6, using the patch processing tiling scheme includes determining one or more of: a size and/or shape of the patches; and an overlap of each patch with neighboring patches.(Guan [p. 156 §IV] "Firstly, we need to turn the input features from a 3-D array into a 2-D array that we can calculate as a matrix. To get a single feature in an output channel, we need to convolve a 3-D cube of input features (also known as a patch) with the corresponding convolution kernels. So we take each one of these input patches and flatten them into a single row of input matrix").
	
	 Regarding claim 8, Guan teaches The electronic device of claim 4, wherein, for the layer processing tiling scheme: the portions of the instances of input data are channels or other subdivisions from among a plurality of channels in the instances of input data; and layer processing is used for processing groups of two or more channels or other subdivisions of the instances of input data.(Guan [p. 155 §4] "According to the rules of matrix multiplication, each output channel is serialized into a column of output matrix" [p. 156 §IV] "Convolutional layers: For communication optimizations in convolutional layers, we use Figure 4 as a simplified example to illustrate the problems and our solutions. In this example, we set the number of input channels as 8, and each channel has 3×3 elements, so we get 72 input elements in total").
	
	 Regarding claim 9, Guan teaches The electronic device of claim 1, wherein the information about the neural network includes information about one or more of: an internal arrangement of the neural network; properties of filters used in the neural network; feature sizes for the neural network; and channel sizes for the neural network. (Guan [p. 155 §4] "According to the rules of matrix multiplication, each output channel is serialized into a column of output matrix" [p. 156 §IV] "Convolutional layers: For communication optimizations in convolutional layers, we use Figure 4 as a simplified example to illustrate the problems and our solutions. In this example, we set the number of input channels as 8, and each channel has 3×3 elements, so we get 72 input elements in total").
	
	 Regarding claim 10, Guan teaches The electronic device of claim 1, wherein the information about the neural network includes information about one or more of:properties of instances of input data to be processed in the neural network; and  properties of outputs of the neural network.(Guan [p. 155 §IVB] "Convolutional Layers: Convolutional layers are overwhelmingly popular in applications like image recognition, object detection, object classification, etc. Suppose we have Nin input channels and Nout output channels").
	
	 Regarding claim 11, Guan teaches The electronic device of claim 1, wherein the information about the properties of the processing circuitry includes information about one or more of: an amount of local memory available for storing data by the processing circuitry; (Guan [p. 154 §IIIB] "we do not have enough on-chip BRAMs on a single FPGA to store all of them for modern DNNs. As a result, we have to allocate data buffers in off-chip DDR memory for storing intermediate activations and model parameters")
	and a processing capacity of the processing circuitry.(Guan [p. 157 §V] "the FPGA platform, we use Catapult [18] system with Altera Stratix-V GSMD5 FPGAs integrated. We use the PikesPeak version of Catapult in our experiments, which has a 4GB DDR3 DRAM as the external memory. The FPGA logic clock frequency is at 150MHz, and the run-time power of the FPGA board is about 25W. This FPGA board is plugged into a PCI-e Gen2 x8 slot of a host computer").
	
	 Regarding claim 12, Guan teaches The electronic device of claim 1, wherein: the processing circuitry includes one or more processors; and(Guan [p. 157 §V] "the FPGA platform, we use Catapult [18] system with Altera Stratix-V GSMD5 FPGAs integrated. We use the PikesPeak version of Catapult in our experiments, which has a 4GB DDR3 DRAM as the external memory. The FPGA logic clock frequency is at 150MHz, and the run-time power of the FPGA board is about 25W. This FPGA board is plugged into a PCI-e Gen2 x8 slot of a host computer")
	one or more of the processors performs the acquiring and the selecting and one or more of the processors performs the processing.(Guan See FIG. 1).
	
	 Regarding claims 13-16, claims 13-16 are directed towards the method performed by the device of claims 1-4, respectively.  Therefore, the rejection applied to claims 1-4 also applies to claims 13-16.
	
	 Regarding claim 19, Guan teaches The method of claim 16, using the patch processing tiling scheme includes determining one or more of: a size and/or shape of the patches; and an overlap of each patch with neighboring patches.(Guan [p. 156 §IV] "Firstly, we need to turn the input features from a 3-D array into a 2-D array that we can calculate as a matrix. To get a single feature in an output channel, we need to convolve a 3-D cube of input features (also known as a patch) with the corresponding convolution kernels. So we take each one of these input patches and flatten them into a single row of input matrix").
	
	 Regarding claims 20-23, claims 20-23 are directed towards the method performed by the device of claims 8-11, respectively.  Therefore, the rejections applied to claims 8-11 also apply to claims 20-23.	

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action: 
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.


	Claims 5, 17,  and 18 are rejected under U.S.C. §103 as being unpatentable over the combination of Guan and Ghasemi (US 10572225 B1).

	 Regarding claim 5, Guan teaches The electronic device of claim 4.
	However, Guan doesn't explicitly teach, wherein, for the line buffer processing tiling scheme: the portions of the instances of input data are lines from among a plurality of lines in the instances of input data; and line buffer processing is used for processing sets of one or more lines from the instances of input data.

	Ghasemi, in the same field of endeavor, teaches The electronic device of claim 4, wherein, for the line buffer processing tiling scheme: the portions of the instances of input data are lines from among a plurality of lines in the instances of input data; and line buffer processing is used for processing sets of one or more lines from the instances of input data.([Col. 1 l. 35-60] "A disclosed circuit arrangement includes a request generator circuit that is configured to read data elements of a three-dimensional (3-D) input feature map (IFM) from a memory and store a subset of the data elements in one of a plurality of N line buffers. Each line buffer is configured for storage of M data elements. A pixel iterator circuit is coupled to the line buffers and is configured to generate a sequence of addresses for reading the stored data elements from the line buffers based on a sequence of IFM height values and a sequence of IFM width values. A disclosed method includes storing a three-dimensional (3-D) input feature map (IFM) in a memory and reading a subset of data elements of the 3-D IFM from the memory by a request generator circuit. The subset of data elements is stored in one of a plurality of N line buffers. Each line buffer is configured for storage of M data elements. A pixel iterator circuit generates read requests to the line buffers. The read requests contain a sequence of addresses referencing the stored data elements in the line buffers, and the sequence of addresses is based on a sequence of IFM height values and a sequence of IFM width values. An array of multiply-and-accumulate (MAC) circuits performs multiple consecutive MAC operations on the data elements read from the line buffers and respective elements of a kernel").

	Guan as well as Ghasemi are directed towards reconfigurable FPGAs for processing neural networks.  Therefore, Guan as well as Ghasemi are analogous art in the same field of endeavor.  It would have been obvious before the effective filing date of the claimed invention to combine the teachings of Guan with the teachings of Ghasemi by using a line buffer to process the portions of instances data.  Ghasemi provides as additional motivation for combination ([Col. 17 l. 58-Col. 18 l. 9] "When a packet of data is loaded in the line buffers 108, the request generator circuit 110 passes a token that notifies the application 112 that a packet of data is ready for processing. Subsequently, the application 112 can traverse the line buffers 108 to access the data elements of an IFM sub-volume. The disclosed approaches enable a fluent dataflow while reducing control overhead. The size (M) of the line buffers 108 can be adjusted to improve the balance between the bandwidth of the external memory 102 and the bandwidth of the application 112").

	 Regarding claim 17, claim 17 is directed towards the method performed by the device of claim 5.  Therefore, the rejection applied to claim 5 also applies to claim 17.
	
	 Regarding claim 18, the combination of Guan and Ghasemi teaches The method of claim 17, wherein, for the patch processing tiling scheme: the portions of the instances of input data are patches from among a plurality of patches in the instances of input data; and patch processing is used for processing patches from the instances of input data. (Guan [p. 156 §IV] "Firstly, we need to turn the input features from a 3-D array into a 2-D array that we can calculate as a matrix. To get a single feature in an output channel, we need to convolve a 3-D cube of input features (also known as a patch) with the corresponding convolution kernels. So we take each one of these input patches and flatten them into a single row of input matrix").

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Sharma (“Dnnweaver: From high-level deep network models to fpga acceleration”, 2016) is directed towards a neural network processor tiling scheme method.
Venieris (“fpgaConvNet: A Framework for Mapping  Convolutional Neural Networks on FPGAs”, 2016) is directed towards a method for tiling convolutional neural networks on FPGA hardware.

THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SIDNEY VINCENT BOSTWICK whose telephone number is (571)272-4720. The examiner can normally be reached M-F 7:30am-5:00pm EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Miranda Huang can be reached on (571)270-7092. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/SIDNEY VINCENT BOSTWICK/Examiner, Art Unit 2124                                                                                                                                                                                                        
/MIRANDA M HUANG/Supervisory Patent Examiner, Art Unit 2124

Read full office action

Prosecution Timeline

Sep 30, 2022

Application Filed

Aug 06, 2025

Non-Final Rejection mailed — §102, §103

Nov 05, 2025

Response Filed

Nov 19, 2025

Final Rejection mailed — §102, §103

Mar 18, 2026

Examiner Interview Summary

Mar 18, 2026

Applicant Interview (Telephonic)

May 18, 2026

Request for Continued Examination

May 20, 2026

Response after Non-Final Action

Precedent Cases

Applications granted by this same examiner with similar technology

17/632,509

Patent 12626139

SECRET SOFTMAX FUNCTION CALCULATION SYSTEM, SECRET SOFTMAX FUNCTION CALCULATION APPARATUS, SECRET SOFTMAX FUNCTION CALCULATION METHOD, SECRET NEURAL NETWORK CALCULATION SYSTEM, SECRET NEURAL NETWORK LEARNING SYSTEM, AND PROGRAM

4y 3m to grant Granted May 12, 2026

18/909,558

Patent 12619815

Magnitude Invariant Multimodal Agent for Efficient Image-Text Interface Automation

1y 6m to grant Granted May 05, 2026

17/373,021

Patent 12561604

SYSTEM AND METHOD FOR ITERATIVE DATA CLUSTERING USING MACHINE LEARNING

4y 7m to grant Granted Feb 24, 2026

18/486,534

Patent 12547878

Highly Efficient Convolutional Neural Networks

2y 4m to grant Granted Feb 10, 2026

16/902,547

Patent 12536426

Smooth Continuous Piecewise Constructed Activation Functions

5y 7m to grant Granted Jan 27, 2026

Study what changed to get past this examiner. Based on 5 most recent grants.

Strategy Recommendation AI-generated — please review before filing

Get a prosecution strategy drawn from examiner precedents, rejection analysis, and claim mapping.

Typically takes 5-10 seconds — AI-generated, attorney review required before filing

Prosecution Projections

3-4

Expected OA Rounds

51%

Grant Probability

89%

With Interview (+38.0%)

4y 5m (~9m remaining)

Median Time to Grant

Moderate

PTA Risk

Based on 138 resolved cases by this examiner. Grant probability derived from career allowance rate.