Prosecution Insights
Last updated: April 19, 2026
Application No. 17/666,829

OPERATION PROCESSING APPARATUS

Non-Final OA (§103, §112)

Filed: Feb 08, 2022
Examiner: VICARY, KEITH E
Art Unit: 2183
Tech Center: 2100 — Computer Architecture & Software
Assignee: Inter-University Research Institute Corporation Research Organization Of Information And Systems
OA Round: 9 (Non-Final)

Grant Probability: 58% (Moderate)
OA Rounds: 9-10
To Grant: 3y 8m
With Interview: 99%

Examiner Intelligence

Career Allow Rate: 58% (393 granted / 683 resolved; +2.5% vs TC avg)
Interview Lift: strong, +41.2% among resolved cases with an interview
Typical Timeline: 3y 8m avg prosecution; 41 applications currently pending
Career History: 724 total applications across all art units

Statute-Specific Performance

§101: 8.7% (-31.3% vs TC avg)
§103: 34.0% (-6.0% vs TC avg)
§102: 12.0% (-28.0% vs TC avg)
§112: 37.6% (-2.4% vs TC avg)

Based on career data from 683 resolved cases; comparisons are against a Tech Center average estimate.
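How the figures above are derived is not documented here; one derivation consistent with the displayed numbers is sketched below. The formulas and the Tech Center baseline values are assumptions, not the analytics provider's documented method.

```python
# Plausible derivations for the dashboard figures above. The formulas and the
# Tech Center (TC) baseline values are assumptions, not the provider's method.

granted, resolved = 393, 683
allow_rate = granted / resolved                      # 0.575 -> shown as "58%"

tc_avg_allow = 0.550                                 # assumed TC baseline
print(f"allow rate vs TC avg: {allow_rate - tc_avg_allow:+.1%}")      # +2.5%

# "Interview Lift" read as: allowance rate among resolved cases with an
# examiner interview minus the rate among those without (rates assumed;
# note that 57.8% + 41.2% matches the 99% "With Interview" figure above).
rate_with, rate_without = 0.990, 0.578
print(f"interview lift: {rate_with - rate_without:+.1%}")             # +41.2%

# The statute-specific rows follow the same pattern, e.g. the §101 row:
sec101_rate, tc_sec101 = 0.087, 0.400                # TC value assumed
print(f"§101 vs TC avg: {sec101_rate - tc_sec101:+.1%}")              # -31.3%
```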

Office Action (§103, §112)
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114

A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on November 10, 2025, has been entered.

Claims 1-20 are pending in this Office action and presented for examination. Claims 1, 5-8, and 17 are newly amended by the RCE received November 10, 2025.

Claim Rejections - 35 USC § 112

The following is a quotation of 35 U.S.C. 112(b):

(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:

The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.

Claims 1-20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.

Claim 1 recites the limitation “the sections” in line 18. However, there is insufficient antecedent basis for this limitation in the claims. Note that this limitation is also recited in claim 1, line 19 (a first instance); claim 1, line 19 (a second instance); claim 1, line 20; claim 2, line 3; and claim 19, line 3.

Claim 1 recites the limitation “the sections” in line 20. However, it is indefinite as to whether the antecedent basis for this limitation is “three or more pipeline sections separated by two or more buffers including a plurality of entries” of claim 1, lines 15-16, or “zero or more, and less than all, and not a most-downstream, of the sections” of claim 1, line 18.

Claim 1 recites the limitation “the buffers” in line 21. However, there is insufficient antecedent basis for this limitation in the claims. Note that this limitation is also recited in claim 3, line 2; claim 4, line 2; claim 5, line 2 (first instance); claim 5, line 2 (second instance); claim 5, line 3; claim 6, line 2 (first instance); claim 6, line 2 (second instance); claim 6, line 3; claim 7, line 2 (first instance); claim 7, line 2 (second instance); claim 7, line 3; claim 13, line 5; claim 14, line 5; claim 15, line 5; and claim 16, line 5. Note that, given that “the buffers” has insufficient antecedent basis, “the entries of the buffers” in claim 5, line 2, consequently has insufficient antecedent basis.

Claims 2-16 and 18-20 are rejected for failing to alleviate the rejections of claim 1 above.

Claim 5 recites the limitation “pipeline sections” in lines 2-3. However, it is indefinite as to whether these pipeline sections are the same as, or different from, “three or more pipeline sections” as recited in claim 1, line 15. Claim 12 is rejected for failing to alleviate the rejection of claim 5 above.
Claim 6 recites the limitation “pipeline sections” in line 2. However, it is indefinite as to whether these pipeline sections are the same as, or different from, “three or more pipeline sections” as recited in claim 1, line 15.

Claim 6 recites the limitation “and the at least one buffer disposed immediately upstream of the execution units, suspend processing of an element operation among the two or more element operations until the source operands needed for execution of the element operation are collected, upstream being a direction in which the element operation proceeds from” in lines 5-9. However, the metes and bounds of this limitation are indefinite. For example, it is indefinite as to whether the claim is conveying that the at least one buffer suspends processing of an element operation, in view of the inserted comma.

Claim 6 recites the limitation “the source operands needed for execution of the element operation” in line 8. However, there is insufficient antecedent basis for this limitation in the claims.

Claim 7 recites the limitation “pipeline sections” in line 2. However, it is indefinite as to whether these pipeline sections are the same as, or different from, “three or more pipeline sections” as recited in claim 1, line 15.

Claim 7 recites the limitation “and the at least one buffer disposed immediately upstream of the execution units, suspend processing of an element operation among the two or more element operations until the source operands needed for execution of the element operation are collected, upstream being a direction in which the element operation proceeds from” in lines 5-9. However, the metes and bounds of this limitation are indefinite. For example, it is indefinite as to whether the claim is conveying that the at least one buffer suspends processing of an element operation, in view of the inserted comma.

Claim 7 recites the limitation “the source operands needed for execution of the element operation” in line 8. However, there is insufficient antecedent basis for this limitation in the claims.

Claim 8 recites the limitation “the backend pipeline is further configured to perform bypassing of an execution result between a producer element operation and a consumer element operation, the producer and consumer element operations being generated from the micro-operation or different micro-operations whether or not a positional relationship of the producer element operation and the consumer element operation is changed due to stopping of sections with tags that uniquely specify execution results” in lines 1-6. However, it is indefinite as to whether the “whether or not” limitation is intended to further limit the “being generated” limitation or the “perform bypassing” limitation.

Claim 17 recites the limitation “the sections” in line 18. However, there is insufficient antecedent basis for this limitation in the claims. Note that this limitation is also recited in claim 17, line 19 (a first instance); claim 17, line 19 (a second instance); claim 17, line 20; and claim 17, line 29.

Claim 17 recites the limitation “the sections” in line 20. However, it is indefinite as to whether the antecedent basis for this limitation is “three or more pipeline sections separated by two or more buffers including a plurality of entries” of claim 17, lines 15-16, or “zero or more, and less than all, and not a most-downstream, of the sections” of claim 17, line 18.

Claim 17 recites the limitation “the buffers” in line 21. However, there is insufficient antecedent basis for this limitation in the claims.

Claim 17 recites the limitation “the execution result” in line 27. However, there is insufficient antecedent basis for this limitation in the claims.

Claim 17 recites the limitation “the sections” in line 29. However, it is indefinite as to whether the antecedent basis for this limitation is “three or more pipeline sections separated by two or more buffers including a plurality of entries” of claim 17, lines 15-16, or “zero or more, and less than all, and not a most-downstream, of the sections” of claim 17, line 18.

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-16, 18, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Bhamidipati et al. (Bhamidipati) (US 6112295) and Zaidi et al. (Zaidi) (US 6065105) in view of Potter et al. (Potter) (US 9600288 B1).

Consider claim 1, Bhamidipati discloses a processor core (col. 2, line 32, processor) comprising: a frontend pipeline that fetches an instruction and generates a micro-operation (col. 2, lines 45-46, instruction fetch 100 (referred to as F), instruction decode 102 (referred to as D)) and stores the generated micro-operation in an element operation issuing circuit (col. 2, lines 46-48, operand address computation 104 (referred to as A), and execution/operand fetch/operand store 106 (referred to X); col. 3, lines 54-64, it should be emphasized that although specific pipe stages are used to describe the present invention, the invention may be incorporated in other pipe stages without exceeding its scope. For example, decoupling queue 306 may be inserted between pipe stage D 102 and pipe stage A 104. Moreover, it should be obvious to one of ordinary skill in the art to apply the present invention to many more pipe stages than the four illustrated in FIG. 1, or to multiple pipelines within a processor or to multiple processors within an illustrative computer system in FIG. 5 without departing from the scope of the present invention; col. 4, lines 35-36, data storage unit 404 with five entries; col. 1, line 39, as a processor further splits up its pipe stages; in other words, a decoupling queue following pipe stage D, or a decoupling queue following pipe stage A, for example, may map to the recited element operation issuing unit); and a backend pipeline (col. 2, lines 46-48, operand address computation 104 (referred to as A), and execution/operand fetch/operand store 106 (referred to X); col. 3, lines 54-64, it should be emphasized that although specific pipe stages are used to describe the present invention, the invention may be incorporated in other pipe stages without exceeding its scope. For example, decoupling queue 306 may be inserted between pipe stage D 102 and pipe stage A 104. Moreover, it should be obvious to one of ordinary skill in the art to apply the present invention to many more pipe stages than the four illustrated in FIG. 1, or to multiple pipelines within a processor or to multiple processors within an illustrative computer system in FIG. 5 without departing from the scope of the present invention; col. 4, lines 35-36, data storage unit 404 with five entries; col. 1, line 39, as a processor further splits up its pipe stages; in other words, the backend pipeline includes pipe stage X, or both pipe stage A and pipe stage X, depending on whether the element operation issuing unit is considered to be a decoupling queue following pipe stage D, or a decoupling queue following pipe stage A, for example),

wherein the element operation issuing circuit is configured to: receive the generated micro-operation from the frontend pipeline; schedule the generated micro-operation; and issue an element operation to a lane of the backend pipeline (col. 2, lines 45-46, instruction fetch 100 (referred to as F), instruction decode 102 (referred to as D); col. 2, lines 46-48, operand address computation 104 (referred to as A), and execution/operand fetch/operand store 106 (referred to X); col. 3, lines 54-64, it should be emphasized that although specific pipe stages are used to describe the present invention, the invention may be incorporated in other pipe stages without exceeding its scope. For example, decoupling queue 306 may be inserted between pipe stage D 102 and pipe stage A 104. Moreover, it should be obvious to one of ordinary skill in the art to apply the present invention to many more pipe stages than the four illustrated in FIG. 1, or to multiple pipelines within a processor or to multiple processors within an illustrative computer system in FIG. 5 without departing from the scope of the present invention; col. 4, lines 35-36, data storage unit 404 with five entries; col. 1, line 39, as a processor further splits up its pipe stages; in other words, a decoupling queue following pipe stage D, or a decoupling queue following pipe stage A, for example, receives a micro-operation, schedules the micro-operation for when the following stage can accept the micro-operation, and issues an element operation onwards),

the backend pipeline is configured to: receive the element operation from the element operation issuing circuit; and process the element operation in the respective lane (col. 2, lines 46-48, operand address computation 104 (referred to as A), and execution/operand fetch/operand store 106 (referred to X); col. 3, lines 54-64, it should be emphasized that although specific pipe stages are used to describe the present invention, the invention may be incorporated in other pipe stages without exceeding its scope. For example, decoupling queue 306 may be inserted between pipe stage D 102 and pipe stage A 104. Moreover, it should be obvious to one of ordinary skill in the art to apply the present invention to many more pipe stages than the four illustrated in FIG. 1, or to multiple pipelines within a processor or to multiple processors within an illustrative computer system in FIG. 5 without departing from the scope of the present invention; col. 4, lines 35-36, data storage unit 404 with five entries; col. 1, line 39, as a processor further splits up its pipe stages; in other words, the backend pipeline includes pipe stage X, or both pipe stage A and pipe stage X, depending on whether the element operation issuing unit is considered to be a decoupling queue following pipe stage D, or a decoupling queue following pipe stage A, for example),

the backend pipeline comprises three or more pipeline sections separated by two or more buffers including a plurality of entries (col. 3, lines 54-64, it should be emphasized that although specific pipe stages are used to describe the present invention, the invention may be incorporated in other pipe stages without exceeding its scope. For example, decoupling queue 306 may be inserted between pipe stage D 102 and pipe stage A 104. Moreover, it should be obvious to one of ordinary skill in the art to apply the present invention to many more pipe stages than the four illustrated in FIG. 1, or to multiple pipelines within a processor or to multiple processors within an illustrative computer system in FIG. 5 without departing from the scope of the present invention; col. 4, lines 35-36, data storage unit 404 with five entries; col. 1, line 39, as a processor further splits up its pipe stages), and when zero or more, and less than all, and not a most-downstream, of the sections stop processing, a remainder of the sections not including the most-downstream of the sections continue processing by storing element operations proceeding to a downstream one of the sections into at least one of the buffers immediately downstream thereof respectively, downstream being a direction in which the element operations proceed to (col. 4, lines 20-21, when external input 406 indicates a stall in the previous pipe stage; col. 1, lines 30-35, instead of waiting for the output of the fetch stage, the processor can obtain instructions directly from the queue and proceed with its decode stage. As a result, the execution of the fetch stage and the decode stage have been decoupled. In other words, the two stages can carry out their own tasks independently; col. 1, lines 16-20, this technique is known as pipelining. Each step in the pipeline, or a pipe stage, completes a part of an instruction. The pipe stages are connected one to the next to form a pipe, where instructions enter at one end, are processed through the stages, and exit at the other end; col. 3, lines 54-64, it should be emphasized that although specific pipe stages are used to describe the present invention, the invention may be incorporated in other pipe stages without exceeding its scope. For example, decoupling queue 306 may be inserted between pipe stage D 102 and pipe stage A 104. Moreover, it should be obvious to one of ordinary skill in the art to apply the present invention to many more pipe stages than the four illustrated in FIG. 1, or to multiple pipelines within a processor or to multiple processors within an illustrative computer system in FIG. 5 without departing from the scope of the present invention; col. 4, lines 35-36, data storage unit 404 with five entries; col. 1, line 39, as a processor further splits up its pipe stages).
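As an illustrative aside (not part of the rejection's reasoning), the decoupling behavior the Bhamidipati passages above describe, in which an upstream pipe stage keeps advancing during a downstream stall by draining into an intervening buffer, can be sketched as follows. The class and function names are hypothetical, not drawn from any cited reference:

```python
from collections import deque

class DecouplingBuffer:
    """Elastic queue between two pipeline sections, in the spirit of
    Bhamidipati's decoupling queue 306 / five-entry data storage unit 404."""

    def __init__(self, entries=5):
        self.entries = entries
        self.slots = deque()

    def can_accept(self):
        return len(self.slots) < self.entries

def tick(producer_op, buf, downstream_stalled):
    """Advance one clock cycle; returns (op consumed downstream, producer advanced?).

    The upstream section keeps retiring operations into the buffer while the
    downstream section is stalled; only a full buffer back-pressures upstream.
    """
    consumed = None
    if not downstream_stalled and buf.slots:
        consumed = buf.slots.popleft()     # downstream drains the buffer
    if producer_op is not None and buf.can_accept():
        buf.slots.append(producer_op)      # upstream proceeds despite the stall
        return consumed, True
    return consumed, producer_op is None
```

With a stalled consumer, tick("op0", buf, downstream_stalled=True) still returns (None, True) until all five entries fill, which is the sense in which the non-stalled sections "continue processing" in the claim language.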
To any extent to which Bhamidipati does not implicitly or inherently disclose a micro-operation, storing the generated micro-operation in the element operation issuing circuit, wherein the element operation issuing circuit is configured to: receive the generated micro-operation from the frontend pipeline; schedule the generated micro-operation; and issue an element operation to a lane of the backend pipeline, the backend pipeline configured to receive the element operation from the element operation issuing circuit; and process the element operation in the lane, Zaidi discloses a micro-operation, storing the generated micro-operation in an element operation issuing circuit, wherein the element operation issuing circuit is configured to: receive the generated micro-operation from a frontend pipeline; schedule the generated micro-operation; and issue an element operation to a lane of a backend pipeline, the backend pipeline configured to receive the element operation from the element operation issuing circuit; and process the element operation in the lane (col. 4, line 1, micro-ops; col. 3, line 61, instruction scheduler 30; col. 4, line 13, micro-op waiting buffer 34; col. 6, line 7, pipeline; col. 7, lines 21-23, prevents a dependent instruction from being issued simultaneously with an instruction on which it depends; col. 9, lines 65-67, delaying the dispatch of the second instruction by a preselected period of time corresponding to a length of time estimated for execution of the first instruction).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of Zaidi with the invention of Bhamidipati, as this modification merely entails combining prior art elements (the prior art elements of Bhamidipati, and the well-known processor architecture subject matter of a micro-operation and issuing, as explicitly disclosed by Zaidi) according to known methods (Zaidi’s disclosure reflects that the use of a micro-operation and issuing in processor architecture is known) to yield predictable results (the invention of Bhamidipati, entailing a micro-operation, scheduling, and issuing), which is an example of a rationale that may support a conclusion of obviousness as per MPEP 2143. Alternatively, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of Zaidi with the invention of Bhamidipati, given that micro-operations facilitate execution of more complex instructions.

However, the combination thus far does not entail that two or more element operations are generated, issued to respective ones of two or more lanes, received, and processed in the respective lanes. On the other hand, Potter discloses two or more element operations are generated, issued to respective ones of two or more lanes, received, and processed in the respective lanes (col. 1, lines 30-31, single instruction multiple data (SIMD) parallel micro-architecture; col. 4, lines 32-36, single-instruction-multiple-data (SIMD) cores typically include a large number of computation units for handling data-level parallelism (DLP). The DLP within applications allows a same operation or task to be applied simultaneously on several different pieces of data). Potter’s teaching exploits parallelism in a data stream (Potter, col. 1, lines 33-34). Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of Potter with the combination of Bhamidipati and Zaidi in order to exploit parallelism in a data stream.

Consider claim 2, the overall combination entails the processor core according to claim 1 (see above). In addition, Zaidi further discloses the element operation issuing circuit is further configured to issue dependent element operations at a timing, the timing at which an execution result can be passed in cases where none of the sections are presumed to stop the processing (col. 6, lines 15-19, also assume that micro-op entry 43 associated with micro-op 31 indicates that micro-op 31 is of the type that will take three clock cycles to execute. In other words, any instructions dependent on micro-op 31 should preferably be scheduled after the next three clock cycles; col. 9, lines 66-67, delaying the dispatch of the second instruction by a preselected period of time corresponding to a length of time estimated for execution of the first instruction). Zaidi’s further teaching reduces pipeline stalls (Zaidi, col. 6, line 7) and time lag (Zaidi, col. 2, line 11). Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the further teaching of Zaidi with the previously-explained combination of Bhamidipati, Zaidi, and Potter in order to reduce pipeline stalls and time lag.

Consider claim 3, the overall combination entails the processor core according to claim 1 (see above), wherein the buffers are of First In-First Out (FIFO), which prevents the element operations from overtaking in each of the two or more lanes (Bhamidipati, col. 4, lines 39-44, as incoming data 412 enters data storage unit 404, WP 802 is incremented in step 702. Similarly, as outgoing data 414 leaves data storage unit 404, RP 800 is also incremented in step 702. When either pointer reaches the end of data storage 404, or entry 5, the pointer is wrapped around or begins from entry 1 again; col. 4, lines 51-55, when RP 800 and WP 802 point to the same entry 2 in step 704, and RP 800 is verified to have wrapped around in step 706, WR mode continues in step 710. In other words, write operations must take place before any further read operations can occur; col. 4, lines 63-66, more particularly, read operations must occur prior to any subsequent write operations, because data storage unit 404 cannot accept any additional incoming data 412; Potter, col. 1, lines 30-31, single instruction multiple data (SIMD) parallel micro-architecture; col. 4, lines 32-36, single-instruction-multiple-data (SIMD) cores typically include a large number of computation units for handling data-level parallelism (DLP). The DLP within applications allows a same operation or task to be applied simultaneously on several different pieces of data).

Consider claim 4, the overall combination entails the processor core according to claim 2 (see above), wherein the buffers are of First In-First Out (FIFO), which prevents the element operations from overtaking in each of the two or more lanes (Bhamidipati, col. 4, lines 39-44, as incoming data 412 enters data storage unit 404, WP 802 is incremented in step 702. Similarly, as outgoing data 414 leaves data storage unit 404, RP 800 is also incremented in step 702. When either pointer reaches the end of data storage 404, or entry 5, the pointer is wrapped around or begins from entry 1 again; col. 4, lines 51-55, when RP 800 and WP 802 point to the same entry 2 in step 704, and RP 800 is verified to have wrapped around in step 706, WR mode continues in step 710. In other words, write operations must take place before any further read operations can occur; col. 4, lines 63-66, more particularly, read operations must occur prior to any subsequent write operations, because data storage unit 404 cannot accept any additional incoming data 412; Potter, col. 1, lines 30-31, single instruction multiple data (SIMD) parallel micro-architecture; col. 4, lines 32-36, single-instruction-multiple-data (SIMD) cores typically include a large number of computation units for handling data-level parallelism (DLP). The DLP within applications allows a same operation or task to be applied simultaneously on several different pieces of data).

Consider claim 5, the combination thus far entails the processor core according to claim 1 (see above), wherein all or some of the entries of the buffers, the buffers being disposed between pipeline sections and including at least one buffer among the buffers that is disposed immediately upstream of execution units that execute the element operations, have a function, and the at least one buffer disposed immediately upstream of the execution units holds an element operation among the two or more element operations, upstream being a direction in which the element operation proceeds from (Bhamidipati, col. 4, lines 20-21, when external input 406 indicates a stall in the previous pipe stage; col. 1, lines 30-35, instead of waiting for the output of the fetch stage, the processor can obtain instructions directly from the queue and proceed with its decode stage. As a result, the execution of the fetch stage and the decode stage have been decoupled. In other words, the two stages can carry out their own tasks independently; col. 1, lines 16-20, this technique is known as pipelining. Each step in the pipeline, or a pipe stage, completes a part of an instruction. The pipe stages are connected one to the next to form a pipe, where instructions enter at one end, are processed through the stages, and exit at the other end; Potter, col. 1, lines 30-31, single instruction multiple data (SIMD) parallel micro-architecture; col. 4, lines 32-36, single-instruction-multiple-data (SIMD) cores typically include a large number of computation units for handling data-level parallelism (DLP). The DLP within applications allows a same operation or task to be applied simultaneously on several different pieces of data). In addition, Zaidi further explicitly discloses receiving execution results, and holding an element operation until source operands needed for execution of the element operation are collected (col. 6, lines 15-19, also assume that micro-op entry 43 associated with micro-op 31 indicates that micro-op 31 is of the type that will take three clock cycles to execute. In other words, any instructions dependent on micro-op 31 should preferably be scheduled after the next three clock cycles; col. 9, lines 66-67, delaying the dispatch of the second instruction by a preselected period of time corresponding to a length of time estimated for execution of the first instruction; col. 1, lines 26-37, another factor that affects whether an instruction is ready for execution is whether the instruction's sources are available. An instruction's sources are the data that the instruction requires before it can be executed. An instruction is said to be dependent on earlier instruction when it cannot be executed until the earlier instruction has been executed. An example of this is when a first instruction calculates or stores results that are to be utilized by a later instruction. In this case, the later instruction cannot be scheduled for execution until the first instruction has executed. This dependency of a later instruction on data derived from an earlier instruction is commonly referred to as data dependency). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the further teaching of Zaidi with the previously-explained combination of Bhamidipati, Zaidi, and Potter in order to ensure correct program execution by accounting for data dependencies. However, the combination thus far does not explicitly entail receiving execution results from a bypass. On the other hand, Potter further discloses receiving execution results from a bypass (col. 10, lines 55-56, the datapath 300 supports bypass of a result that is consumed by a younger instruction in-program-order). Potter’s teaching saves energy and reduces a number of dynamic stalls (Potter, col. 5, lines 13-23). Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the further teaching of Potter with the previously-explained combination of Bhamidipati, Zaidi, and Potter in order to save energy and reduce a number of dynamic stalls.

Consider claim 6, the combination thus far entails the processor core according to claim 2 (see above), wherein all or some of entries of the buffers, the buffers being disposed between pipeline sections and including at least one buffer among the buffers that is disposed immediately upstream of execution units that execute the element operations, have a function, and the at least one buffer disposed immediately upstream of the execution units, suspend processing of an element operation among the two or more element operations, upstream being a direction in which the element operation proceeds from (Bhamidipati, col. 4, lines 20-21, when external input 406 indicates a stall in the previous pipe stage; col. 1, lines 30-35, instead of waiting for the output of the fetch stage, the processor can obtain instructions directly from the queue and proceed with its decode stage. As a result, the execution of the fetch stage and the decode stage have been decoupled. In other words, the two stages can carry out their own tasks independently; col. 1, lines 16-20, this technique is known as pipelining. Each step in the pipeline, or a pipe stage, completes a part of an instruction. The pipe stages are connected one to the next to form a pipe, where instructions enter at one end, are processed through the stages, and exit at the other end; Potter, col. 1, lines 30-31, single instruction multiple data (SIMD) parallel micro-architecture; col. 4, lines 32-36, single-instruction-multiple-data (SIMD) cores typically include a large number of computation units for handling data-level parallelism (DLP). The DLP within applications allows a same operation or task to be applied simultaneously on several different pieces of data).
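Also purely illustrative: the read-pointer/write-pointer discipline quoted above for claims 3 and 4 (Bhamidipati's RP 800 and WP 802 wrapping over a five-entry store) amounts to a circular FIFO, the property the rejection reads onto "prevents the element operations from overtaking." A minimal sketch, using an occupancy count in place of Bhamidipati's wrapped-around check:

```python
class CircularFifo:
    """Circular FIFO with wrapping read/write pointers (cf. RP 800 / WP 802).

    Entries leave in exactly the order they arrived, so one element
    operation can never overtake another inside the buffer.
    """

    def __init__(self, entries=5):
        self.data = [None] * entries
        self.rp = 0      # read pointer
        self.wp = 0      # write pointer
        self.count = 0   # occupancy; stands in for the wrapped-around check

    def write(self, value):
        if self.count == len(self.data):
            raise RuntimeError("full: reads must occur before further writes")
        self.data[self.wp] = value
        self.wp = (self.wp + 1) % len(self.data)   # wrap at the last entry
        self.count += 1

    def read(self):
        if self.count == 0:
            raise RuntimeError("empty: writes must occur before further reads")
        value = self.data[self.rp]
        self.rp = (self.rp + 1) % len(self.data)   # wrap at the last entry
        self.count -= 1
        return value
```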
In addition, Zaidi additionally explicitly discloses receiving source operands, and suspending processing of an element operation until source operands needed for execution of the element operation are collected (col. 6, lines 15-19, also assume that micro-op entry 43 associated with micro-op 31 indicates that micro-op 31 is of the type that will take three clock cycles to execute. In other words, any instructions dependent on micro-op 31 should preferably be scheduled after the next three clock cycles; col. 9, lines 66-67, delaying the dispatch of the second instruction by a preselected period of time corresponding to a length of time estimated for execution of the first instruction; col. 1, lines 26-37, another factor that affects whether an instruction is ready for execution is whether the instruction's sources are available. An instruction's sources are the data that the instruction requires before it can be executed. An instruction is said to be dependent on earlier instruction when it cannot be executed until the earlier instruction has been executed. An example of this is when a first instruction calculates or stores results that are to be utilized by a later instruction. In this case, the later instruction cannot be scheduled for execution until the first instruction has executed. This dependency of a later instruction on data derived from an earlier instruction is commonly referred to as data dependency). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the additional teaching of Zaidi with the previously-explained combination of Bhamidipati, Zaidi, and Potter in order to ensure correct program execution by accounting for data dependencies. However, the combination thus far does not explicitly entail receiving source operands from a bypass. On the other hand, Potter further discloses receiving source operands from a bypass (col. 10, lines 55-56, the datapath 300 supports bypass of a result that is consumed by a younger instruction in-program-order). Potter’s teaching saves energy and reduces a number of dynamic stalls (Potter, col. 5, lines 13-23). Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the further teaching of Potter with the previously-explained combination of Bhamidipati, Zaidi, and Potter in order to save energy and reduce a number of dynamic stalls.

Consider claim 7, the combination thus far entails the processor core according to claim 3 (see above), wherein all or some of entries of the buffers, the buffers being disposed between pipeline sections and including at least one buffer among the buffers disposed immediately upstream of execution units that execute the element operations, have a function, and the at least one buffer disposed immediately upstream of the execution units, suspend processing of an element operation among the two or more element operations, upstream being a direction in which the element operation proceeds from (Bhamidipati, col. 4, lines 20-21, when external input 406 indicates a stall in the previous pipe stage; col. 1, lines 30-35, instead of waiting for the output of the fetch stage, the processor can obtain instructions directly from the queue and proceed with its decode stage. As a result, the execution of the fetch stage and the decode stage have been decoupled. In other words, the two stages can carry out their own tasks independently; col. 1, lines 16-20, this technique is known as pipelining. Each step in the pipeline, or a pipe stage, completes a part of an instruction. The pipe stages are connected one to the next to form a pipe, where instructions enter at one end, are processed through the stages, and exit at the other end; Potter, col. 1, lines 30-31, single instruction multiple data (SIMD) parallel micro-architecture; col. 4, lines 32-36, single-instruction-multiple-data (SIMD) cores typically include a large number of computation units for handling data-level parallelism (DLP). The DLP within applications allows a same operation or task to be applied simultaneously on several different pieces of data). In addition, Zaidi further explicitly discloses receiving source operands, and suspending processing of an element operation until source operands needed for execution of the element operation are collected (col. 6, lines 15-19, also assume that micro-op entry 43 associated with micro-op 31 indicates that micro-op 31 is of the type that will take three clock cycles to execute. In other words, any instructions dependent on micro-op 31 should preferably be scheduled after the next three clock cycles; col. 9, lines 66-67, delaying the dispatch of the second instruction by a preselected period of time corresponding to a length of time estimated for execution of the first instruction; col. 1, lines 26-37, another factor that affects whether an instruction is ready for execution is whether the instruction's sources are available. An instruction's sources are the data that the instruction requires before it can be executed. An instruction is said to be dependent on earlier instruction when it cannot be executed until the earlier instruction has been executed. An example of this is when a first instruction calculates or stores results that are to be utilized by a later instruction. In this case, the later instruction cannot be scheduled for execution until the first instruction has executed. This dependency of a later instruction on data derived from an earlier instruction is commonly referred to as data dependency). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the further teaching of Zaidi with the previously-explained combination of Bhamidipati, Zaidi, and Potter in order to ensure correct program execution by accounting for data dependencies. However, the combination thus far does not explicitly entail receiving source operands from a bypass. On the other hand, Potter further discloses receiving source operands from a bypass (col. 10, lines 55-56, the datapath 300 supports bypass of a result that is consumed by a younger instruction in-program-order). Potter’s teaching saves energy and reduces a number of dynamic stalls (Potter, col. 5, lines 13-23). Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the further teaching of Potter with the previously-explained combination of Bhamidipati, Zaidi, and Potter in order to save energy and reduce a number of dynamic stalls.

Consider claim 8, the combination thus far entails the processor core according to claim 1 (see above), and a positional relationship of operations being generated from the micro-operation or different micro-operations changing or not changing due to stopping of sections (Bhamidipati, col. 4, lines 20-21, when external input 406 indicates a stall in the previous pipe stage; col. 1, lines 30-35, instead of waiting for the output of the fetch stage, the processor can obtain instructions directly from the queue and proceed with its decode stage. As a result, the execution of the fetch stage and the decode stage have been decoupled. In other words, the two stages can carry out their own tasks independently; col. 1, lines 16-20, this technique is known as pipelining. Each step in the pipeline, or a pipe stage, completes a part of an instruction. The pipe stages are connected one to the next to form a pipe, where instructions enter at one end, are processed through the stages, and exit at the other end), but does not entail the backend pipeline is further configured to perform bypassing of an execution result between a producer element operation and a consumer element operation whether or not a positional relationship of the producer element operation and the consumer element operation is changed due to stopping of sections with tags that uniquely specify execution results, by attaching a tag for a destination operand to an execution result on a side of the producer element operation of a bypass, and by selecting an execution result that is attached with a same tag as a tag for a source operand on a side of the consumer element operation of the bypass. On the other hand, Potter further discloses a backend pipeline is further configured to perform bypassing of an execution result between a producer element operation and a consumer element operation whether or not a positional relationship of the producer element operation and the consumer element operation is changed due to stopping of sections with tags that uniquely specify execution results, by attaching a tag for a destination operand to an execution result on a side of the producer element operation of a bypass, and by selecting an execution result that is attached with a same tag as a tag for a source operand on a side of the consumer element operation of the bypass (col. 12, lines 17-25, hardware determines data dependencies by comparing particular fields within instructions. These fields may include at least destination/result identifiers (IDs), thread IDs, and so forth. Further, the hardware may compare these fields for an instruction entering the execution pipeline and another instruction completing a given stage of the one or more execution pipeline stages. In one embodiment, the given stage is the last stage, N-1, of the execution pipeline). Potter’s further teaching saves energy and reduces a number of dynamic stalls (Potter, col. 5, lines 13-23). Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the further teaching of Potter with the previously-explained combination of Bhamidipati, Zaidi, and Potter in order to save energy and reduce a number of dynamic stalls.

Consider claim 9, the combination thus far entails the processor core according to claim 2 (see above), and a positional relationship of operations is changed due to stopping of related sections (Bhamidipati, col. 4, lines 20-21, when external input 406 indicates a stall in the previous pipe stage; col. 1, lines 30-35, instead of waiting for the output of the fetch stage, the processor can obtain instructions directly from the queue and proceed with its decode stage. As a result, the execution of the fetch stage and the decode stage have been decoupled. In other words, the two stages can carry out their own tasks independently; col. 1, lines 16-20, this technique is known as pipelining. Each step in the pipeline, or a pipe stage, completes a part of an instruction. The pipe stages are connected one to the next to form a pipe, where instructions enter at one end, are processed through the stages, and exit at the other end), but does not entail the backend pipeline is further configured to bypass another execution result of element operations by attaching a tag that uniquely specifies the another execution result, transmitting the another execution result to a bypass on a sender side of the bypass, and receiving the another execution result by match comparing tags on a receiver side of the bypass. On the other hand, Potter discloses the backend pipeline is further configured to bypass another execution result of element operations by attaching a tag that uniquely specifies the another execution result, transmitting the another execution result to a bypass on a sender side of the bypass, and receiving the another execution result by match comparing tags on a receiver side of the bypass (col. 12, lines 17-25, hardware determines data dependencies by comparing particular fields within instructions. These fields may include at least destination/result identifiers (IDs), thread IDs, and so forth. Further, the hardware may compare these fields for an instruction entering the execution pipeline and another instruction completing a given stage of the one or more execution pipeline stages. In one embodiment, the given stage is the last stage, N-1, of the execution pipeline). Potter’s teaching saves energy and reduces a number of dynamic stalls (Potter, col. 5, lines 13-23). Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the further teaching of Potter with the previously-explained combination of Bhamidipati, Zaidi, and Potter in order to save energy and reduce a number of dynamic stalls.

Consider claim 10, the combination thus far entails the processor core according to claim 3 (see above), and a positional relationship of operations is changed due to stopping of related sections (Bhamidipati, col. 4, lines 20-21, when external input 406 indicates a stall in the previous pipe stage; col. 1, lines 30-35, instead of waiting for the output of the fetch stage, the processor can obtain instructions directly from the queue and proceed with its decode stage. As a result, the execution of the fetch stage and the decode stage have been decoupled. In other words, the two stages can carry out their own tasks independently; col. 1, lines 16-20, this technique is known as pipelining. Each step in the pipeline, or a pipe stage, completes a part of an instruction. The pipe stages are connected one to the next to form a pipe, where instructions enter at one end, are processed through the stages, and exit at the other end), but does not entail the backend pipeline is further configured to bypass an execution result of element operations by attaching a tag that uniquely specifies the execution result, transmitting the execution result to a bypass on a sender side of the bypass, and receiving the execution result by match comparing tags on a receiver side of the bypass.
On the other hand, Potter discloses the backend pipeline is further configured to bypass an execution result of element operations by attaching a tag that uniquely specifies the execution result, transmitting the execution result to a bypass on a sender side of the bypass, and receiving the execution result by match comparing tags on a receiver side of the bypass (col. 12, lines 17-25, hardware determines data dependencies by comparing particular fields within instructions. These fields may include at least destination/result identifiers (IDs), thread IDs, and so forth. Further, the hardware may compare these fields for an instruction entering the execution pipeline and another instruction completing a given stage of the one or more execution pipeline stages. In one embodiment, the given stage is the last stage, N-1, of the execution pipeline). Potter’s teaching saves energy and reduces a number of dynamic stalls (Potter, col. 5, lines 13-23). Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the further teaching of Potter with the previously-explained combination of Bhamidipati, Zaidi, and Potter in order to save energy and reduce a number of dynamic stalls.

Consider claim 11, the combination thus far entails the processor core according to claim 4 (see above), and a positional relationship of operations is changed due to stopping of related sections (Bhamidipati, col. 4, lines 20-21, when external input 406 indicates a stall in the previous pipe stage; col. 1, lines 30-35, instead of waiting for the output of the fetch stage, the processor can obtain instructions directly from the queue and proceed with its decode stage. As a result, the execution of the fetch stage and the decode stage have been decoupled. In other words, the two stages can carry out their own tasks independently; col. 1, lines 16-20, this technique is known as pipelining. Each step in the pipeline, or a pipe stage, completes a part of an instruction. The pipe stages are connected one to the next to form a pipe, where instructions enter at one end, are processed through the stages, and exit at the other end), but does not entail the backend pipeline is further configured to bypass another execution result of element operations by attaching a tag that uniquely specifies the another execution result, transmitting the another execution result to a bypass on a sender side of the bypass, and receiving the another execution result by match comparing tags on a receiver side of the bypass. On the other hand, Potter discloses the backend pipeline is further configured to bypass another execution result of element operations by attaching a tag that uniquely specifies the another execution result, transmitting the another execution result to a bypass on a sender side of the bypass, and receiving the another execution result by match comparing tags on a receiver side of the bypass (col. 12, lines 17-25, hardware determines data dependencies by comparing particular fields within instructions. These fields may include at least destination/result identifiers (IDs), thread IDs, and so forth. Further, the hardware may compare these fields for an instruction entering the execution pipeline and another instruction completing a given stage of the one or more execution pipeline stages. In one embodiment, the given stage is the last stage, N-1, of the execution pipeline). Potter’s teaching saves energy and reduces a number of dynamic stalls (Potter, col. 5, lines 13-23). Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the further teaching of Potter with the previously-explained combination of Bhamidipati, Zaidi, and Potter in order to save energy and reduce a number of dynamic stalls.

Consider claim 12, the combination thus far entails the processor core according to claim 5 (see above), wherein the backend pipeline is further configured to bypass an execution result of element operations between which a positional relationship is changed due to stopping of related sections by attaching a tag that uniquely specifies the execution result, transmitting the execution result to a bypass on a sender side of the bypass, and receiving the execution result by match comparing tags on a receiver side of the bypass (Potter, col. 12, lines 17-25, hardware determines data dependencies by comparing particular fields within instructions. These fields may include at least destination/result identifiers (IDs), thread IDs, and so forth. Further, the hardware may compare these fields for an instruction entering the execution pipeline and another instruction completing a given stage of the one or more execution pipeline stages. In one embodiment, the given stage is the last stage, N-1, of the execution pipeline; Bhamidipati, col. 4, lines 20-21, when external input 406 indicates a stall in the previous pipe stage; col. 1, lines 30-35, instead of waiting for the output of the fetch stage, the processor can obtain instructions directly from the queue and proceed with its decode stage. As a result, the execution of the fetch stage and the decode stage have been decoupled. In other words, the two stages can carry out their own tasks independently; col. 1, lines 16-20, this technique is known as pipelining. Each step in the pipeline, or a pipe stage, completes a part of an instruction. The pipe stages are connected one to the next to form a pipe, where instructions enter at one end, are processed through the stages, and exit at the other end).

Consider claim 13, the combination thus far entails the processor core according to claim 1 (see above), and a positional relationship of operations is changed or not changed due to stopping of sections (Bhamidipati, col. 4, lines 20-21, when external input 406 indicates a stall in the previous pipe stage; col. 1, lines 30-35, instead of waiting for the output of the fetch stage, the processor can obtain instructions directly from the queue and proceed with its decode stage. As a result, the execution of the fetch stage and the decode stage have been decoupled. In other words, the two stages can carry out their own tasks independently; col. 1, lines 16-20, this technique is known as pipelining. Each step in the pipeline, or a pipe stage, completes a part of an instruction. The pipe stages are connected one to the next to form a pipe, where instructions enter at one end, are processed through the stages, and exit at the other end), but does not entail the backend pipeline is further configured to perform bypassing of an execution result between a producer element operation and a consumer element operation by tracking locations, the locations each of which is either an entry of the buffers or an entry of one of pipeline registers, in which the producer element operation and the consumer element operation are stored.
On the other hand, Potter further discloses a backend pipeline is further configured to perform bypassing of an execution result between a producer element operation and a consumer element operation by tracking locations, the locations each of which is either an entry of the buffers or an entry of one of pipeline registers, in which the producer element operation and the consumer element operation are stored (see FIG. 3, for example, which shows bypassing occurring when the producer element operation and the consumer element operation are in particular stages). Potter’s further teaching saves energy and reduces a number of dynamic stalls (Potter, col. 5, lines 13-23). Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the further teaching of Potter with the previously-explained combination of Bhamidipati, Zaidi, and Potter in order to save energy and reduce a number of dynamic stalls.

Consider claim 14, the combination thus far entails the processor core according to claim 2 (see above), and a positional relationship of operations is changed or not changed due to stopping of sections (Bhamidipati, col. 4, lines 20-21, when external input 406 indicates a stall in the previous pipe stage; col. 1, lines 30-35, instead of waiting for the output of the fetch stage, the processor can obtain instructions directly from the queue and proceed with its decode stage. As a result, the execution of the fetch stage and the decode stage have been decoupled. In other words, the two stages can carry out their own tasks independently; col. 1, lines 16-20, this technique is known as pipelining. Each step in the pipeline, or a pipe stage, completes a part of an instruction. The pipe stages are connected one to the next to form a pipe, where instructions enter at one end, are processed through the stages, and exit at the other end), but does not entail the backend pipeline is further configured to perform bypassing of an execution result between a producer element operation and a consumer element operation by tracking locations, the locations each of which is either an entry of the buffers or an entry of one of pipeline registers, in which the producer element operation and the consumer element operation are stored. On the other hand, Potter discloses a backend pipeline is further configured to perform bypassing of an execution result between a producer element operation and a consumer element operation by tracking locations, the locations each of which is either an entry of the buffers or an entry of one of pipeline registers, in which the producer element operation and the consumer element operation are stored (see FIG. 3, for example, which shows bypassing occurring when the producer element operation and the consumer element operation are in particular stages). Potter’s teaching saves energy and reduces a number of dynamic stalls (Potter, col. 5, lines 13-23). Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the further teaching of Potter with the previously-explained combination of Bhamidipati, Zaidi, and Potter in order to save energy and reduce a number of dynamic stalls.

Consider claim 15, the combination thus far entails the processor core according to claim 3 (see above), and a positional relationship of operations is changed or not changed due to stopping of sections (Bhamidipati, col. 4, lines 20-21, when external input 406 indicates a stall in the previous pipe stage; col. 1, lines 30-35, instead of waiting for the output of the fetch stage, the processor can obtain instructions directly from the queue and proceed with its decode stage. As a result, the execution of the fetch stage and the decode stage have been decoupled. In other words, the two stages can carry out their own tasks independently; col. 1, lines 16-20, this technique is known as pipelining. Each step in the pipeline, or a pipe stage, completes a part of an instruction. The pipe stages are connected one to the next to form a pipe, where instructions enter at one end, are processed through the stages, and exit at the other end), but does not entail the backend pipeline is further configured to perform bypassing of an execution result between a producer element operation and a consumer element operation by tracking locations, the locations each of which is either an entry of the buffers or an entry of one of pipeline registers, in which the producer element operation and the consumer element operation are stored. On the other hand, Potter discloses a backend pipeline is further configured to perform bypassing of an execution result between a producer element operation and a consumer element operation by tracking locations, the locations each of which is either an entry of the buffers or an entry of one of pipeline registers, in which the producer element operation and the consumer element operation are stored (see FIG. 3, for example, which shows bypassing occurring when the producer element operation and the consumer element operation are in particular stages). Potter’s teaching saves energy and reduces a number of dynamic stalls (Potter, col. 5, lines 13-23). Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the further teaching of Potter with the previously-explained combination of Bhamidipati, Zaidi, and Potter in order to save energy and reduce a number of dynamic stalls.

Consider claim 16, the combination thus far entails the processor core according to claim 4 (see above), and a positional relationship of operations is changed or not changed due to stopping of sections (Bhamidipati, col. 4, lines 20-21, when external input 406 indicates a stall in the previous pipe stage; col. 1, lines 30-35, instead of waiting for the output of the fetch stage, the processor can obtain instructions directly from the queue and proceed with its decode stage. As a result, the execution of the fetch stage and the decode stage have been decoupled. In other words, the two stages can carry out their own tasks independently; col. 1, lines 16-20, this technique is known as pipelining. Each step in the pipeline, or a pipe stage, completes a part of an instruction. The pipe stages are connected one to the next to form a pipe, where instructions enter at one end, are processed through the stages, and exit at the other end), but does not entail the backend pipeline is further configured to perform bypassing of an execution result between a producer element operation and a consumer element operation by tracking locations, the locations each of which is either an entry of the buffers or an entry of one of pipeline registers, in which the producer element operation and the consumer element operation are stored.
On the other hand, Potter discloses that a backend pipeline is further configured to perform bypassing of an execution result between a producer element operation and a consumer element operation by tracking locations, the locations each of which is either an entry of the buffers or an entry of one of pipeline registers, in which the producer element operation and the consumer element operation are stored (see FIG. 3, for example, which shows bypassing occurring when the producer element operation and the consumer element operation are in particular stages). Potter’s teaching saves energy and reduces a number of dynamic stalls (Potter, col. 5, lines 13-23). Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the further teaching of Potter with the previously-explained combination of Bhamidipati, Zaidi, and Potter in order to save energy and reduce a number of dynamic stalls.

Consider claim 18: the overall combination entails the processor core according to claim 1 (see above), wherein a bypass to an entry is physically omitted in a case where receipt of an execution result at a downstream entry is ensured (Bhamidipati, FIG. 3, which does not disclose a bypass, despite this embodiment necessarily entailing an execution result being received by that which stores or further processes the execution result).

Consider claim 20: the overall combination entails the processor core according to claim 1 (see above), wherein all or some of the two or more lanes each handle an element operation of a Single Instruction/Multiple Data stream (SIMD) instruction (Potter, col. 1, lines 30-31, single instruction multiple data (SIMD) parallel micro-architecture; col. 4, lines 32-36, single-instruction-multiple-data (SIMD) cores typically include a large number of computation units for handling data-level parallelism (DLP). The DLP within applications allows a same operation or task to be applied simultaneously on several different pieces of data).

Claim(s) 19 is/are rejected under 35 U.S.C. 103 as being unpatentable over Bhamidipati, Zaidi, and Potter as applied to claim 1 above, and further in view of Olson et al. (Olson) (US 20150058572).

Consider claim 19: the combination thus far entails the processor core according to claim 1 (see above), but does not explicitly entail a register file and a level-one cache, wherein one or both of the register file and the level-one cache have multibank configurations, and a bank conflict in the multibank configurations is one of causes of stopping of the sections. On the other hand, Olson discloses a register file ([0038], lines 16-17, multi-banked register file) and a level-one cache ([0041], line 5, L1 cache), wherein one or both of the register file and the level-one cache have multibank configurations ([0038], line 17, multi-banked register file), and a bank conflict in the multibank configurations is one of causes of stopping of sections ([0038], lines 16-17, bank conflict stalls may occur even in a multi-banked register file; [0114], lines 8-11, this scenario is referred to as a "bank conflict" in various embodiments, and may slow overall execution because an instruction may need to stall until all operands are read, for example).
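To illustrate the bank-conflict stall Olson describes, here is a minimal sketch; the four-bank count and the modulo striping are assumptions for illustration, not Olson's actual register-file organization. Each bank serves one read per cycle, so an instruction whose source operands map to the same bank needs extra cycles; that serialization is the stall cited as one cause of stopping a section.

```python
from collections import Counter

NUM_BANKS = 4  # illustrative bank count, not taken from Olson

def bank_of(reg: int) -> int:
    # Hypothetical mapping: architectural registers striped across banks.
    return reg % NUM_BANKS

def operand_read_cycles(src_regs: list[int]) -> int:
    """Cycles to read all source operands, one read port per bank per cycle."""
    reads_per_bank = Counter(bank_of(r) for r in src_regs)
    # With no conflict every operand arrives in one cycle; a conflict
    # serializes that bank's reads, so the instruction stalls until all
    # operands are read.
    return max(reads_per_bank.values())

assert operand_read_cycles([1, 2, 3]) == 1  # banks 1, 2, 3: no conflict
assert operand_read_cycles([1, 5]) == 2     # both regs map to bank 1: stall
```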
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of Olson with the combination of Bhamidipati, Zaidi, and Potter in view of the reduced cost of a multibank configuration relative to a large register file structure (see Olson, [0003], lines 10-14), while still ensuring correct program execution via stalling when a bank conflict does occur.

Response to Arguments

Applicant on page 10 argues: “Claims 1-20 are rejected under 35 U.S.C. §§ 112(a) and (b). These rejections are now moot in view of the above claim amendments. Accordingly, withdrawal of these rejections is respectfully requested.” All previously presented rejections of the claims under 35 U.S.C. §112(a), and various previously presented rejections under 35 U.S.C. §112(b), are withdrawn in view of the amendments to the claims. However, other previously presented §112(b) rejections remain applicable, and in various cases the amendments introduce additional §112(b) issues; see the Claim Rejections - 35 USC § 112 section above.

Applicant on page 12 argues: “As reflected in the above-noted features, amended claim 1 recites issuing two or more element operations generated from the one micro-operation. With the above claimed features, including the last paragraph's features, the plurality of element operations of one micro-operation can proceed and otherwise stop with a continue/stall of a section of the backend pipeline. Such features contribute to the ease of application of the present invention to multiple processors/multiple pipeline configurations, meaning that the present invention can be applied individually to the internal pipeline of multiple processors, such as superscalar processor (core) or SIMD targets, and does not mean instruction-level parallelism within a single processor/single pipeline. In contrast, Bhamidipati and Zaidi do not address a superscalar processor (core) nor a SIMD pipeline.” In view of the aforementioned amendments, Examiner is newly relying upon the Potter reference to reject claim 1.

Applicant on page 12 argues: “In col. 3, lines 57-58, the decoupling queue can be placed between decode D and operand calculate A, between D and A, upstream of the backend pipeline. Therefore, Bhamidipati does not teach buffers in backend pipeline.” However, Examiner submits that Bhamidipati teaches buffers in a backend pipeline, in view of the particular portions of Bhamidipati cited above.

Applicant on page 13 argues: “Furthermore, the cited references do not address a SIMD pipeline.” In view of the aforementioned amendments, Examiner is newly relying upon the Potter reference to reject claim 1.

Applicant on page 13 argues: “Furthermore, when a buffer is inserted in the backend, the difference in the number of buffered cycles between producer instructions and consumer instructions will result in a deviation from the positional relationship determined at the time of scheduling. Accordingly, the cited references that do not disclose bypass control, and their disclosed functionality with a buffer somehow inserted in the backend, will not operate correctly without the bypass control recited in Claims 16 and 17.” Examiner generally notes that the aforementioned claims do not appear to recite “bypass control,” a “difference in the number of buffered cycles,” or determining a positional relationship at the time of scheduling.
Examiner further notes that the claims do not appear to mandate such a deviation.

Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to KEITH E VICARY, whose telephone number is (571) 270-1314. The examiner can normally be reached Monday to Friday, 9:00 AM to 5:00 PM.

Examiner interviews are available via telephone, in person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Jyoti Mehta, can be reached at (571) 270-3995. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/KEITH E VICARY/
Primary Examiner, Art Unit 2183
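Much of the claim-1 dispute above turns on Bhamidipati's decoupling queue between fetch and decode and on where such a buffer sits relative to the backend pipeline. As background, here is a minimal sketch of what decoupling two pipeline stages with a queue does; the stage names, queue depth, and stall schedule are illustrative assumptions, not Bhamidipati's design.

```python
from collections import deque

FETCH_QUEUE_DEPTH = 4  # illustrative depth, not taken from Bhamidipati

def simulate(fetch_stalls: set[int], decode_stalls: set[int],
             n_cycles: int) -> list[str]:
    """Run a toy two-stage pipeline with a decoupling queue between stages."""
    queue: deque[int] = deque()
    next_inst = 0
    log: list[str] = []
    for cycle in range(n_cycles):
        # Decode end: proceeds whenever the queue has work, even if fetch
        # is stalled this cycle.
        if cycle not in decode_stalls and queue:
            log.append(f"cycle {cycle}: decoded inst {queue.popleft()}")
        # Fetch end: enqueues unless stalled or the queue is full.
        if cycle not in fetch_stalls and len(queue) < FETCH_QUEUE_DEPTH:
            queue.append(next_inst)
            next_inst += 1
    return log

# Decode stalls early (the queue fills), then fetch stalls for three cycles;
# decode nonetheless drains instructions 0-2 during the fetch stall.
for line in simulate(fetch_stalls={3, 4, 5}, decode_stalls={1, 2}, n_cycles=6):
    print(line)
```

The observable effect matches the passage of Bhamidipati quoted repeatedly above: once the queue holds instructions, the decode end keeps draining it while the fetch end is stalled, so the two stages carry out their tasks independently.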

Prosecution Timeline

Feb 08, 2022: Application Filed
Mar 13, 2023: Non-Final Rejection (§103, §112)
Jun 15, 2023: Response Filed
Jul 05, 2023: Final Rejection (§103, §112)
Oct 11, 2023: Response after Non-Final Action
Nov 09, 2023: Request for Continued Examination
Nov 21, 2023: Response after Non-Final Action
Dec 05, 2023: Non-Final Rejection (§103, §112)
Mar 01, 2024: Examiner Interview Summary
Mar 01, 2024: Applicant Interview (Telephonic)
Mar 12, 2024: Response Filed
Mar 27, 2024: Final Rejection (§103, §112)
Jun 10, 2024: Applicant Interview (Telephonic)
Jun 10, 2024: Examiner Interview Summary
Jun 28, 2024: Response after Non-Final Action
Jul 08, 2024: Response after Non-Final Action
Jul 30, 2024: Request for Continued Examination
Aug 01, 2024: Response after Non-Final Action
Aug 13, 2024: Non-Final Rejection (§103, §112)
Nov 15, 2024: Response Filed
Dec 03, 2024: Final Rejection (§103, §112)
Mar 06, 2025: Request for Continued Examination
Mar 13, 2025: Response after Non-Final Action
Apr 21, 2025: Non-Final Rejection (§103, §112)
Jul 21, 2025: Applicant Interview (Telephonic)
Jul 21, 2025: Examiner Interview Summary
Jul 23, 2025: Response Filed
Aug 06, 2025: Final Rejection (§103, §112)
Nov 10, 2025: Request for Continued Examination
Nov 16, 2025: Response after Non-Final Action
Dec 29, 2025: Non-Final Rejection (§103, §112)
Mar 26, 2026: Applicant Interview (Telephonic)
Mar 26, 2026: Examiner Interview Summary

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12602349: HANDLING DYNAMIC TENSOR LENGTHS IN A RECONFIGURABLE PROCESSOR THAT INCLUDES MULTIPLE MEMORY UNITS (granted Apr 14, 2026; 2y 5m to grant)
Patent 12572360: Cache Preload Operations Using Streaming Engine (granted Mar 10, 2026; 2y 5m to grant)
Patent 12554507: SYSTEMS AND METHODS FOR PROCESSING FORMATTED DATA IN COMPUTATIONAL STORAGE (granted Feb 17, 2026; 2y 5m to grant)
Patent 12554494: APPARATUSES, METHODS, AND SYSTEMS FOR INSTRUCTIONS TO REQUEST A HISTORY RESET OF A PROCESSOR CORE (granted Feb 17, 2026; 2y 5m to grant)
Patent 12547401: Load Instruction Fusion (granted Feb 10, 2026; 2y 5m to grant)
Based on 5 most recent grants.

Prosecution Projections

Expected OA Rounds: 9-10
Grant Probability: 58%
With Interview: 99% (+41.2%)
Median Time to Grant: 3y 8m
PTA Risk: High
Based on 683 resolved cases by this examiner. Grant probability derived from career allow rate.
