Last updated: April 19, 2026

Application No. 18/359,774

SYSTEM FOR OPTIMIZING VISION TRANSFORMER BLOCKS

Final Rejection §103 §DP

Filed

Jul 26, 2023

Examiner

VARNDELL, ROSS E

Art Unit

2674

Tech Center

2600 — Communications

Assignee

Micron Technology, Inc.

OA Round

2 (Final)

Interview Optional

+13.0% interview lift. This examiner has a relatively high allow rate; a written response may suffice.

Based on 615 resolved cases, 2023–2026

Examiner Intelligence

VARNDELL, ROSS E

Grants 85% — above average

Career Allow Rate

520 granted / 615 resolved

+22.6% vs TC avg

Moderate +13% lift

+13.0%

Interview Lift

Typical timeline

2y 4m

Avg Prosecution

28 currently pending

Career history

643

Total Applications

across all art units

Statute-Specific Performance

§101: 6.3% (-33.7% vs TC avg)
§103: 66.9% (+26.9% vs TC avg)
§102: 6.4% (-33.6% vs TC avg)
§112: 10.7% (-29.3% vs TC avg)

Based on career data from 615 resolved cases

Office Action

§103 §DP

DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Arguments
This office action is in response to the amendment filed 3/2/2026. Claims 1-20 are pending in this application and have been considered below.
Applicant’s arguments with respect to claims 1-20 have been considered but are moot in view of new ground(s) of rejection because of the amendments.

	
	
Claim Objections
Claims 1-12 and 20 are objected to because of the following informalities:
In claims 1 and 20, “to configured the processor to” is grammatically incorrect and potentially indefinite.
Claims 2-12 depend, directly or indirectly, from objected claim 1 and are therefore also objected to.
Appropriate correction is required.

Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP § 2146 et seq. for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
The filing of a terminal disclaimer by itself is not a complete reply to a nonstatutory double patenting (NSDP) rejection. A complete reply requires that the terminal disclaimer be accompanied by a reply requesting reconsideration of the prior Office action. Even where the NSDP rejection is provisional the reply must be complete. See MPEP § 804, subsection I.B.1. For a reply to a non-final Office action, see 37 CFR 1.111(a). For a reply to final Office action, see 37 CFR 1.113(c). A request for reconsideration while not provided for in 37 CFR 1.113(c) may be filed after final for consideration. See MPEP §§ 706.07(e) and 714.13.
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The actual filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/apply/applying-online/eterminal-disclaimer.
Claims 1-20 are rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1-20 of copending Application No. 18/359,786 (not yet published). Although the claims at issue are not identical, they are not patentably distinct from each other because they both claim the same optimized mobile vision transformer architecture, specifically the sequence of concatenating the local and global representation outputs in a fusion block and subsequently fusing the original input features with that fusion block output. The minor variations in scope, such as the depthwise-separable convolution in the independent claims versus the dependent claims, do not constitute a patentable distinction between the two applications.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 1-3, 6-14, 16-17, and 19-20 are rejected under 35 U.S.C. 103 as being unpatentable over Mehta et al. (MobileViT: Light-Weight, General-Purpose, And Mobile-Friendly Vision Transformer (4 Mar 2022) – hereinafter “MobileViT” or “Mehta”) in view of Peng et al. (Conformer: Local Features Coupling Global Representations for Visual Recognition – hereinafter “Peng” or “Conformer”).
Claims 1, 13, and 20.
MobileViT discloses a system, comprising: a memory; and a processor (p. 9: “GPUs” and “memory”) configured to execute instructions from the memory to configured the processor to; 
receive content as an input to a neural network for performance of a computer vision task (MobileViT p. 2: “combine the strengths of CNNs and transformers to build ViT models for mobile vision tasks”),
wherein the neural network comprises a mobile vision transformer block comprising a local representation block, a global representation block, and a fusion block (MobileViT p. 2, Fig. 1(b) shows a local representation block, a global representation block, and a fusion block – shown below);

[Figure: MobileViT Fig. 1(b), greyscale image media_image1.png, not reproduced here]

generate, by applying at least one convolutional layer of the local representation block on the input (Fig. 1(b), Conv-n x n), a local representation output comprising a local representation for each portion of the content located at each location of a plurality of locations within the content (MobileViT on local representations for context, p. 4: "applies a n x n convolution ... followed by a point-wise (1x1) … The n x n layer encodes local spatial information”); 
concatenate, in the fusion block, (MobileViT p. 5: "XF is ... combined with X via concatenation ... Another n x n convolutional layer is then used to fuse these concatenated features." MobileViT's global representation comes from applying transformers to unfolded patches; see the next element.);
generate, by utilizing a fusion convolutional layer of the fusion block (fusion convolution in the fusion block: “Another nxn convolutional layer is then used to fuse these concatenated features”), a fusion block output based on the concatenated local and global representation (MobileViT p. 5, global representation: "We unfold XL into … patches ... apply transformers to obtain XG ... we can fold XG … to obtain XF. XF … is combined with X via concatenation operation”); and
fuse input features associated with the input with the fusion block output to generate an output of the neural network to facilitate performance of the computer vision task (fusing input features with the fused output: MobileViT p. 2, the fusion stage concatenates XF with X (the input features) and uses a convolution to fuse them, yielding the block output Y (see Fig. 1(b) pipeline); p. 5: "XF ... combined with X via concatenation ... Another nxn convolution ... to fuse these concatenated features." MobileViT shows utility on CV tasks (classification/detection/segmentation).).
MobileViT discloses all of the subject matter as described above except for specifically teaching “the local representation output with a global representation output associated with the content to generate a concatenated local and global representation of the content.” However, Conformer in the same field of endeavor teaches the local representation output with a global representation output associated with the content to generate a concatenated local and global representation of the content (Conformer p. 2: “Feature Coupling Unit (FCU) … since CNN and transformer branches tend to capture features of different levels (e.g., local vs. global), FCU is inserted into every block to consecutively eliminate the semantic divergence between them, in an interactive fashion. Such a fusion procedure can greatly enhance the global perception capability of local features and the local details of global representations.”; p. 4: “FCU is proposed as a bridge module to fuse local features in the CNN branch with global representations in the transformer branch, Fig. 2(b).”).
Therefore, it would have been obvious to one of ordinary skill in the art to combine MobileViT and Conformer before the effective filing date of the claimed invention. It would have been obvious to modify MobileViT’s fusion block to concatenate the local representation with the global representation (instead of the raw input), as taught by Conformer’s Feature Coupling Unit (FCU). The motivation is to enhance the representation learning of the network by retaining both the fine-grained local details (from the CNN/local block) and the long-distance feature dependencies (from the Transformer/global block) before final processing.
Claims 13 and 20 recite the same technical steps/structures in method and apparatus form. The teachings cited above (MobileViT for block structure, global representation, concatenation + fusion conv; Conformer for the FCU fusion of the local representation with the global representation) read on those claims for the same reasons.
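The data flow at issue in claim 1, as the rejection characterizes it, can be illustrated with a minimal pure-Python sketch: concatenate a local and a global feature map along the channel axis, apply a pointwise (1x1) fusion convolution, then add the input features back. All function names, the tiny 2x2 tensors, and the weights below are hypothetical illustrations; they are not taken from the application, MobileViT, or Conformer, and the 1x1 kernel and additive skip are assumptions borrowed from the language of claims 11 and 12.

```python
from typing import List

Tensor = List[List[List[float]]]  # indexed [channel][row][col]


def concat_channels(a: Tensor, b: Tensor) -> Tensor:
    """Stack two feature maps along the channel axis (spatial sizes must match)."""
    assert len(a[0]) == len(b[0]) and len(a[0][0]) == len(b[0][0])
    return a + b


def conv1x1(x: Tensor, weights: List[List[float]]) -> Tensor:
    """Pointwise convolution: each output channel is a weighted sum of input channels."""
    height, width = len(x[0]), len(x[0][0])
    return [
        [[sum(w * x[c][i][j] for c, w in enumerate(row)) for j in range(width)]
         for i in range(height)]
        for row in weights
    ]


def add(a: Tensor, b: Tensor) -> Tensor:
    """Element-wise addition, used here as the skip connection."""
    return [[[p + q for p, q in zip(ra, rb)] for ra, rb in zip(ca, cb)]
            for ca, cb in zip(a, b)]


def fuse(x: Tensor, local: Tensor, global_: Tensor,
         weights: List[List[float]]) -> Tensor:
    """Concatenate local and global maps, fuse with a 1x1 conv, then add the input."""
    fused = conv1x1(concat_channels(local, global_), weights)
    return add(x, fused)


# Hypothetical single-channel 2x2 example with arbitrary weights.
x = [[[1.0, 2.0], [3.0, 4.0]]]
local = [[[1.0, 0.0], [0.0, 1.0]]]
global_ = [[[2.0, 2.0], [2.0, 2.0]]]
weights = [[0.5, 0.25]]  # one output channel, two input channels
y = fuse(x, local, global_, weights)
```

In this sketch the concatenation doubles the channel count and the 1x1 convolution projects it back to the input's channel count, which is what makes the final element-wise addition well defined.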
Claim 2.
The combination of MobileViT and Conformer discloses the system of claim 1, wherein the processor is further configured to generate, by utilizing the global representation block, the global representation output for an entire portion of the content (MobileViT p. 5: "each pixel in X_G can encode information from all pixels in X, as shown in Figure 4. Thus, the overall effective receptive field of MobileViT is H x W." Figure 4 caption: "Every pixel sees every other pixel in the MobileViT block").
Claim(s) 3 and 14.
The combination of MobileViT and Conformer discloses the system of claim 1, wherein the processor is further configured to generate a feature map from the content by utilizing the neural network (MobileViT: Page 2, Figure 1 (b): Shows feature maps at various spatial dimensions (128x128, 64x64, 32x32, 16x16, 8x8)).
Claim(s) 6 and 17.
The combination of MobileViT and Conformer discloses the system of claim 1, wherein computer vision task comprises content classification associated with the content, segmentation associated with the content, object detection associated with the content, or a combination thereof (MobileViT Page 7: "we first evaluate MobileViTs performance on the ImageNet-1k dataset" (classification); Page 8: "4.2.1 Mobile Object Detection" and "4.2.2 Mobile Semantic Segmentation" sections; Table 1: Detection w/ SSDLite; Table 2: Segmentation w/ Deeplabv3).
Claim 7.
The combination of MobileViT and Conformer discloses the system of claim 1, wherein the processor is further configured to initiate generation of the global representation output based on an unfolded version of the local representation output, wherein the unfolded version of the local representation output comprises N non-overlapping flattened patches associated with the content (MobileViT Page 5: "we unfold X_L into N non-overlapping flattened patches X_U"; Figure 1 (b): Shows the "Unfold" operation in the MobileViT block diagram applied to the local representation output).
Claim 8.
The combination of MobileViT and Conformer discloses the system of claim 7, wherein the processor is further configured to apply a transformer to the unfolded version of the local representation output during generation of the global representation output (MobileViT Page 5: "inter-patch relationships are encoded by applying transformers to obtain X_G"; Equation (1): "X_G(p) = Transformer(X_U(p)), 1 …").
Claim 9.
The combination of MobileViT and Conformer discloses the system of claim 8, wherein the processor is further configured to conduct a folding operation after application of the transformer to generate the global representation output (MobileViT Page 5: "we can fold X_G … to obtain X_F"; Figure 1 (b): Shows "Fold" operation after the Transformer block).
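The unfold, transform, and fold sequence mapped in claims 7 through 9 can be sketched on a single-channel grid. This is a simplified illustration under stated assumptions: the helper names are hypothetical, the 4x4 input and 2x2 patch size are arbitrary, and `mix_across_patches` is a stand-in that averages over patches rather than an actual transformer, used only to show that values at the same position in different patches interact before folding.

```python
from typing import List

Grid = List[List[float]]


def unfold(x: Grid, ph: int, pw: int) -> List[List[float]]:
    """Return P rows (one per pixel position inside a patch), each with N values (one per patch)."""
    height, width = len(x), len(x[0])
    assert height % ph == 0 and width % pw == 0
    rows = []
    for di in range(ph):
        for dj in range(pw):
            rows.append([x[i + di][j + dj]
                         for i in range(0, height, ph)
                         for j in range(0, width, pw)])
    return rows


def fold(u: List[List[float]], height: int, width: int, ph: int, pw: int) -> Grid:
    """Inverse of unfold: scatter the P x N values back onto an H x W grid."""
    x = [[0.0] * width for _ in range(height)]
    for p, row in enumerate(u):
        di, dj = divmod(p, pw)
        n = 0
        for i in range(0, height, ph):
            for j in range(0, width, pw):
                x[i + di][j + dj] = row[n]
                n += 1
    return x


def mix_across_patches(u: List[List[float]]) -> List[List[float]]:
    """Stand-in for the transformer: average over patches at each pixel position."""
    return [[sum(row) / len(row)] * len(row) for row in u]


# Hypothetical 4x4 input split into four non-overlapping 2x2 patches.
x_local = [[0.0, 1.0, 2.0, 3.0],
           [4.0, 5.0, 6.0, 7.0],
           [8.0, 9.0, 10.0, 11.0],
           [12.0, 13.0, 14.0, 15.0]]
x_unfolded = unfold(x_local, 2, 2)
x_folded = fold(mix_across_patches(x_unfolded), 4, 4, 2, 2)
```

Folding the unfolded grid without any mixing returns the original input, which is a quick way to check that the two operations are inverses.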
Claim(s) 10 and 19.
The combination of MobileViT and Conformer discloses the system of claim 1, wherein the processor is further configured to apply, in the fusion block, a convolution to the global representation prior to concatenation of the local representation with the global representation (MobileViT Figure 1 (b): Shows "Conv-1x1" before the Fusion block in the data flow).
Claim 11.
The combination of MobileViT and Conformer discloses the system of claim 1, wherein fusion convolutional layer comprises a 1x1 convolutional layer (MobileViT Figure 1 (b): Shows "Conv-1x1" before the Fusion block in the data flow).
Claim(s) 12 and 16.
The combination of MobileViT and Conformer discloses the system of claim 1, wherein the processor is further to generate the output of the neural network to facilitate the performance of the computer vision task based on addition of the input features to the fusion block output (MobileViT Figure 1 (b): Shows a skip connection (red arrow) around the MobileViT block; Page 15 (Appendix C): "Impact of skip-connection ... With this connection, the performance of MobileViT-S improves by 0.5%").

Claims 5 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over MobileViT in view of Conformer as applied to claims 1 and 13 above, and further in view of Chen et al. (Mobile-Former: Bridging MobileNet and Transformer (v3 – last revised 3 Mar 2022) – hereinafter “Chen” or “Mobile-Former”).
Claim(s) 5 and 18.
The combination of MobileViT and Conformer discloses the system of claim 1, wherein the processor is further configured to generate the local representation output by applying a 1x1 convolution after applying (MobileViT Page 4: "MobileViT applies a n x n standard convolutional layer followed by a pointwise (or 1 x 1) convolutional layer to produce X_L").
MobileViT discloses all of the subject matter as described above except for specifically teaching “a depthwise-separable convolutional layer.” However, Mobile-Former in the same field of endeavor teaches a depthwise-separable convolutional layer (Mobile-Former Page 4: "depthwise and pointwise convolution"; Mobile-Former teaches depthwise-separable convolution via an inverted bottleneck).
Therefore, it would have been obvious to one of ordinary skill in the art to combine MobileViT, Conformer, and Mobile-Former before the effective filing date of the claimed invention. It would have been obvious to modify the local representation of MobileViT and Conformer to replace the standard initial convolution with a depthwise-separable convolution, as taught by Mobile-Former, to increase computational efficiency. Both MobileViT and Mobile-Former are directed to optimizing neural networks for resource-constrained mobile devices. A PHOSITA would recognize this modification is the simple substitution of one known element with another to obtain the predictable result of reducing the network parameter count and computational latency while preserving its ability to accurately extract local spatial features.
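The parameter-count reduction the examiner relies on can be checked with simple arithmetic. The sketch below compares a standard n x n convolution with a depthwise-separable one (a depthwise n x n convolution followed by a pointwise 1x1 convolution), ignoring bias terms. The function names and the 64-channel, 3x3 configuration are hypothetical and are not drawn from the application or the cited references.

```python
def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a depthwise k x k convolution plus a pointwise 1x1 convolution (bias ignored)."""
    depthwise = c_in * k * k   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1x1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer: 64 input channels, 64 output channels, 3x3 kernel.
standard = standard_conv_params(64, 64, 3)
separable = depthwise_separable_params(64, 64, 3)
```

For this assumed configuration the separable form needs 4,672 weights against 36,864 for the standard convolution, roughly an eightfold reduction; the ratio depends on the channel counts and kernel size chosen.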


Allowable Subject Matter
Claims 4 and 15 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims along with a timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) to overcome the rejection based on nonstatutory double patenting.

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
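The reply windows described above reduce to month arithmetic from the mailing date. The following is a rough sketch only: it assumes the mailing date is Mar 11, 2026 (the Final Rejection date shown in the prosecution timeline on this page), uses a hypothetical helper `add_months`, and does not account for weekends, federal holidays, or the advisory-action exception, so it should not be treated as a docketing tool.

```python
import calendar
from datetime import date


def add_months(d: date, months: int) -> date:
    """Same day-of-month `months` later, clamped to the last day of a shorter month."""
    index = d.month - 1 + months
    year, month = d.year + index // 12, index % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


# Assumption: mailing date taken from the prosecution timeline, not from the action itself.
mailing_date = date(2026, 3, 11)

two_month_mark = add_months(mailing_date, 2)        # first-reply window tied to the advisory action
shortened_period_end = add_months(mailing_date, 3)  # shortened statutory period
statutory_maximum = add_months(mailing_date, 6)     # absolute outer limit for reply
```

Under the assumed mailing date these come out to May 11, Jun 11, and Sep 11, 2026 respectively.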
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Ross Varndell whose telephone number is (571)270-1922.  The examiner can normally be reached M-F, 9-5 EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, O’Neal Mistry can be reached at (313)446-4912.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/Ross Varndell/Primary Examiner, Art Unit 2674


Prosecution Timeline

Jul 26, 2023

Application Filed

Nov 26, 2025

Non-Final Rejection — §103, §DP

Mar 02, 2026

Response Filed

Mar 11, 2026

Final Rejection — §103, §DP (current)

Precedent Cases

Applications granted by this same examiner with similar technology

18/188,565

Patent 12603810

System and Method for Communications Beam Recovery

2y 5m to grant Granted Apr 14, 2026

18/356,461

Patent 12597238

AUTOMATIC IMAGE VARIETY SIMULATION FOR IMPROVED DEEP LEARNING PERFORMANCE

2y 5m to grant Granted Apr 07, 2026

17/433,084

Patent 12582348

DEVICE AND METHOD FOR INSPECTING A HAIR SAMPLE

2y 5m to grant Granted Mar 24, 2026

18/053,348

Patent 12579441

SYSTEMS AND METHODS FOR IMAGE RECONSTRUCTION

2y 5m to grant Granted Mar 17, 2026

18/370,758

Patent 12579786

SYSTEM AND METHOD FOR PROPERTY TYPICALITY DETERMINATION

2y 5m to grant Granted Mar 17, 2026

Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

3-4

Expected OA Rounds

85%

Grant Probability

98%

With Interview (+13.0%)

2y 4m

Median Time to Grant

Moderate

PTA Risk

Based on 615 resolved cases by this examiner. Grant probability derived from career allow rate.