DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Objections
Claim 1, and therefore claims 2–10 which depend therefrom, are objected to because of the following informalities, with reference to the original claim set filed December 23, 2023:
In lines 4, 6, and 27, the steps “the original images of road surface are collected and a dataset is established,” “the images from the dataset are inputted,” and “the feature maps obtained from the backbone are inputted,” are recited, but better form including corrections for proper antecedent basis for these limitations would be as follows: “collecting original images of road surface and establishing a dataset from the original images,” “inputting the original images into the backbone,” and “inputting the feature maps obtained from the backbone.”
In lines 11, 14 and 25, the stair1, stair2 and stair3 limitations are not recited as method steps, but rather recite limitations as to the respective stair structures. These limitations should be either rewritten in wherein clauses, or in better form by positively reciting the method steps/operations realized by these structures and making those operations comprised within the claimed method.
A review of all limitations in claim 1 should be made for proper antecedent basis, with the majority of the objections being based on incorrectly introducing new limitations with the definite article “the” instead of “a” or “an.” For example, in line 11, “the expansion factor” should instead be “an expansion factor,” in line 14, “the kernel stride” should instead recite “a kernel stride,” and “the channels” should just be “channels” or “a plurality of channels,” in line 17, “the two sets of channels” should instead be “the two equal parts of channels,” and in line 18 “the shuffle operation” should instead be “a shuffle operation.”
Claim 2, line 2, recites “the Faster R-Stair model,” but should recite “a Faster R-Stair model,” line 3 recites “the batch normalization layer” but should recite “a batch normalization layer,” and line 9 recites “the output feature map after Batch normalization, m represents the number” but should instead recite “an output feature map after the batch normalization, m represents a number.”
Both of claims 3 and 4 recite “the Faster R-Stair model,” but should recite “a Faster R-Stair model,” and lines 6–7 recite “the input data” and “the output data,” but should recite just “input data” and “output data” (no “the”) as these limitations are being newly introduced.
Claim 5 recites “the Faster R-Stair model,” but should recite “a Faster R-Stair model,” in lines 2–3 recites “the enhanced feature map of road cracks,” but should recite “an enhanced feature map of road cracks,” line 8 recites “the ECA mechanism” but should simply recite “the ECA,” in line 9 “in this patent” should be removed, “denotes the feature maps output from the ECA” should instead recite “denotes feature maps output from the ECA,” in line 10, “the sigmoid operation” should be “a sigmoid operation,” and “the convolution operation,” should be “a convolution operation.”
Claim 6 recites “the Faster R-Stair model,” but should recite “a Faster R-Stair model,” in line 3 recites “the feature extraction capability,” but should recite “a feature extraction capability,” in lines 3–4 recites “containing the channel attention” should be “containing a channel attention,” line 5 reciting “apply the average pooling” should be “apply average pooling,” line 6 “the channel dimensions” should instead be “channel dimensions,” lines 7–8 “the channel attention map” should instead be “a channel attention map,” line 10 reciting “the output feature map after the channel attention process,” should be “an output feature map after a channel attention process,” line 11 reciting “denotes the fully connected layers, σ is sigmoid operation,” should instead be “denotes fully connected layers, σ is a sigmoid operation,” line 12 reciting “AvgPool() is the average pooling operation, MaxPool() is the max pooling” should recite “AvgPool() is an average pooling operation, MaxPool() is a max pooling,” line 18 recites “denotes the output feature map after the spatial attention process,” but should recite “denotes an output feature map after a spatial attention process,” and line 19 recites “the 7 X 7 convolution operation,” but should recite “a 7 X 7 convolution operation.”
Claim 7 recites “the Faster R-Stair model,” but should recite “a Faster R-Stair model,” line 2 recites “of anchor” but should recite “of an anchor,” line 3 recites “and RPN head,” but should recite “and an RPN head,” line 5 recites “convolutional layer … the training processes,” but should recite “convolutional layers … wherein training processes,” line 9 recites “two parallel convolutional layers contains the target scores,” but should recite “two parallel 1 X 1 convolutional layers contains target scores,” line 13 recites “where cls is the target score representing the crack probability,” but should recite “where cls is a target score representing a crack probability,” line 14 recites “is the regression parameter of the ith anchor box,” but should recite “is a regression parameter of an ith anchor box,” line 16 recites “the regression parameters output from the RPN head is used,” but should recite “regression parameters output from the RPN head are used.”
Claim 8 recites “the Faster R-Stair model,” but should recite “a Faster R-Stair model,” and line 12 recites “the GT box” but should instead recite “a ground truth (GT) box,” assuming that GT intends to refer to the plain meaning as understood by a person having ordinary skill in the art of machine learning in a training context as “ground truth.”
Claim 9 recites “the Faster R-Stair model,” but should recite “a Faster R-Stair model,” line 4 recites “by ROI pooling” but should recite “by the ROI pooling,” line 8 recites “to the steps in RPN” but should recite “to steps in RPN,” line 14 recites “comprises the loss of crack classification and loss of regression parameters,” but should recite “comprises a loss of crack classification and a loss of regression parameters,” line 15 recites “is the softmax … u is the label,” but should recite “is a softmax … u is a label,” line 16 recites “is the regression parameter,” but should recite “is a regression parameter,” line 17 recites “is the regression parameter of the GT box corresponding to the real target,” but should recite “is a regression parameter of a ground truth (GT) box corresponding to a real target,” line 18 recites “denotes the Iverson bracket; the Adam algorithm,” but should recite “denotes an Iverson bracket; an Adam algorithm,” line 19 recites “optimize the internal parameters,” but should recite “optimize internal parameters,” line 20 recites “The formula for calculating the final bounding box coordinates” but should recite “the formula for calculating final bounding box coordinates,” line 26 recites “the height of the final predicted bounding boxes … are the regression,” but should recite “the height of final predicted bounding boxes … are regression.”
Claim 10 recites “the Faster R-Stair model,” but should recite “a Faster R-Stair model,” line 2 recites “the internal parameters,” but should recite “internal parameters,” line 3 recites “using the Adam algorithm,” but should recite “using an Adam algorithm,” line 12 recites “the parameters to be updated … gt denotes the gradient” but should recite “parameters to be updated … gt denotes a gradient,” line 13 recites a period “.” after the theta and before the beta characters but should not have a period there, and line 13 also recites “is the first-order moment … β2 is the,” but should recite “is a first-order moment … β2 is a,” line 14 recites “represents the expected value,” but should recite “represents an expected value,” line 15 recites “represents the expected value … is the bias correction,” but should recite “represents an expected value … is a bias correction,” line 16 recites “is the bias correction” but should recite “is a bias correction,” and lines 16 and 17 both have periods, which should be removed, and finally line 18 recites “the learning rate,” but should recite “a learning rate.”
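For context regarding the Adam quantities enumerated in claim 10 (first-order moment, second-order moment, bias corrections, learning rate), a single scalar Adam update takes the following general form. This is the examiner's illustrative sketch only; the function name and the standard hyperparameter defaults are the examiner's assumptions and are not taken from the claim as filed.

```python
import math

def adam_step(theta, grad, m, v, t, lr=0.001, b1=0.9, b2=0.999, eps=1e-8):
    # One scalar Adam update illustrating the quantities named in claim 10:
    # m is the first-order moment estimate (an expected value of the gradient),
    # v is the second-order moment estimate (an expected value of the squared
    # gradient), m_hat and v_hat are the bias corrections, and lr is the
    # learning rate. Defaults are the commonly used values (assumed).
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad * grad
    m_hat = m / (1 - b1 ** t)          # bias-corrected first moment
    v_hat = v / (1 - b2 ** t)          # bias-corrected second moment
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v
```

On the first step (t = 1) with a unit gradient, the bias corrections cancel the (1 − β) factors exactly, so the parameter moves by approximately one learning rate in the negative gradient direction.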
Claims 2–10 all recite “where in a feature of the method for fast detecting road cracks,” but for better readability in the claims should instead recite “wherein the method further comprises:” and the subsequent limitations recited as method steps.
Appropriate correction is required.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
Claims 1–10 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Regarding claim 1, and therefore claims 2–10 which depend therefrom, the claims are indefinite for the following reasons:
Lines 18 and 23–24 recite “the shuffle operation,” but a shuffle operation is not introduced earlier in the claim, and it is unclear and indefinite whether the second recited “the shuffle operation” in lines 23–24 is the same shuffle operation as that recited in line 18, or a different shuffle operation.
In lines 21 and 22, “dimension reduction” is recited, and it is unclear and indefinite whether the second recited dimension reduction in line 22 is the same one as recited in line 21, or is different.
Line 27 recites “the feature maps obtained from the backbone” and then lines 28–29 recite “the feature maps outputted from the backbone,” and it is unclear and indefinite whether these two limitations are the same “feature maps” or if they are different feature maps.
Lines 29–30 recite “obtain corresponding feature matrices, the feature matrix is,” which is unclear and indefinite because it is unclear whether “the feature matrix” is one of the obtained corresponding feature matrices (plural) or if it is a different feature matrix.
Regarding claim 3, line 2 recites “the data” but there is no antecedent basis for this limitation, and it is unclear and indefinite. Line 3 recites “RELU6 (RE) activation function in each layer,” however, “layer” has no antecedent basis, and it is unclear and indefinite to what the “each layer” intends to refer.
Regarding claim 4, similar to the rejection above in claim 3, claim 4 also recites “the data” and “in each layer” both of which have no antecedent basis and are unclear and indefinite.
Regarding claim 5, in line 4 is recited “the data” and lines 7–8 recites “in the input data,” however, these limitations do not have antecedent basis and are unclear and indefinite. Further, line 9 recites “in this patent,” which also lacks antecedent basis, and is also unclear and indefinite.
Regarding claim 6, lines 5–6 recite “apply the average pooling and max pooling … this helps to compress the channel dimensions … and merge them by element-wise summation to generate the channel attention map,” which is unclear and indefinite because it is unclear whether “helps to” requires that the compressing and merging by element-wise summation actually be performed, or whether some assistance toward, but not necessarily realization of, the compressing and merging suffices to meet the limitations of claim 6.
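For context, the pooling-and-merge sequence quoted above, if required to be fully performed, would correspond to something like the following illustrative sketch in plain Python. This is the examiner's own illustration, not the applicant's disclosed implementation: the shared MLP of a CBAM-style channel attention module is omitted (taken as identity), and the sigmoid after the element-wise summation is an assumption.

```python
import math

def channel_attention(fmap):
    # fmap: list of per-channel 2-D maps (lists of rows). For each channel,
    # average pooling and max pooling compress the spatial dimensions to a
    # single value each; the two pooled descriptors are merged by element-wise
    # summation and squashed by a sigmoid to give a per-channel attention
    # weight, which then rescales that channel.
    weights = []
    for ch in fmap:
        flat = [v for row in ch for v in row]
        avg = sum(flat) / len(flat)          # average pooling over H x W
        mx = max(flat)                       # max pooling over H x W
        weights.append(1.0 / (1.0 + math.exp(-(avg + mx))))  # sigmoid(sum)
    return [[[v * w for v in row] for row in ch] for ch, w in zip(fmap, weights)]
```

For a channel that is uniformly 1.0, both pooled values are 1.0, so the attention weight is sigmoid(2) and every element of that channel is scaled by that weight.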
Regarding claim 7, lines 5 and 8 each recite “a ReLU activation function,” and therefore it is unclear and indefinite whether the second “a ReLU activation function” is intended to be the same as the first recited “ReLU activation function,” or if they are different. Further, line 17 recites “the formula of the process,” but “the process” lacks antecedent basis and is unclear and indefinite. Still further, line 25 recites “the regression parameters predicted by the RPN head,” however, line 16 recites “the regression parameters output from the RPN head,” and it is therefore unclear and indefinite whether the predicted parameters are intended to be the same as the output parameters, or if they are different parameters.
Regarding claim 9, line 8 recites “parameters for each proposal, similar to the steps in RPN,” which is unclear and indefinite because “the steps in RPN” lacks antecedent basis, and what would be considered “similar to” is an indefinite term of degree. Further, lines 8–9 recite “the losses of the fully connected layers should be calculated,” where “should be” is unclear and indefinite as it is unclear whether the claimed calculations are required, or merely suggestions not required to be part of the claim scope. Still further, line 19 recites “parameters of the model,” where “the model” lacks antecedent basis and is unclear and indefinite.
Regarding claim 10, line 2 recites “the network,” line 11 recites “the RPN or ROI Head network,” and line 12 recites “the model,” where none of these limitations have antecedent basis and are unclear and indefinite. In lines 16 and 17 is recited “the parameters,” which lack antecedent basis and are unclear and indefinite.
Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP § 2146 et seq. for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
The filing of a terminal disclaimer by itself is not a complete reply to a nonstatutory double patenting (NSDP) rejection. A complete reply requires that the terminal disclaimer be accompanied by a reply requesting reconsideration of the prior Office action. Even where the NSDP rejection is provisional, the reply must be complete. See MPEP § 804, subsection I.B.1. For a reply to a non-final Office action, see 37 CFR 1.111(a). For a reply to a final Office action, see 37 CFR 1.113(c). A request for reconsideration, while not provided for in 37 CFR 1.113(c), may be filed after final for consideration. See MPEP §§ 706.07(e) and 714.13.
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The actual filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/apply/applying-online/eterminal-disclaimer.
Claims 1–6 are provisionally rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1 and 4–8 of copending Application No. 18/398,205 (herein “‘205 application”) in view of Guo et al., “Road damage detection algorithm for improved YOLOv5,” Sci Rep 12, 15523, September 15, 2022, https://doi.org/10.1038/s41598-022-19674-8 (herein “Guo”), and further in view of Christian et al., "Radar and Camera Fusion for Object Forecasting in Driving Scenarios," 2022 IEEE 15th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC), Penang, Malaysia, December 22, 2022, pp. 105-111, doi: 10.1109/MCSoC57363.2022.00026 (herein “Christian”). This rejection is made in view of the claims as best understood, given the indefiniteness issues listed above, for compact prosecution purposes.
This is a provisional nonstatutory double patenting rejection.
Regarding claim 1 of the present application, the correspondence to claim 1 of the ‘205 application is as follows, with deficiencies noted in square brackets [], and examiner-added explanations in curly brackets {}:
Limitation from claim 1 of the present application
Limitation from claim 1 of the ‘205 application
A method for fast detecting road cracks in images using a depth-stage dependent and hyperparameter-adaptive lightweight CNN-based model, comprising the following steps:
A three-stage modularized CNN for rapidly classifying concrete cracks {includes cracks in roads made of concrete} in images,
comprising the following steps: {steps teaching a method}
the structural blocks in stair2 have two variations: when the stride is set to 1 … when the stride is set to 2 {thus hyperparameter adaptive, as strides are hyperparameters}
one part of the channel passes through an inverted residual structure that includes a depthwise separable convolution (DConvs) {thus depth-stage dependent}
[the original images of road surface] are collected and a dataset is established, including a training set and a validation set;
a concrete crack dataset is built for training the CNN;
[the images from the dataset are inputted into the backbone (backbone-Stair) to obtain feature maps, the backbone is depth-stage dependent,] which includes suitable structures in different depths: convolutional layer, stair1, a convolution block attention module (CBAM), stair2, another CBAM, and stair3, the basic components in stair1 and stair2 have some variations according to the adjustment of some hyperparameters;
the structure of the three-stage modularized CNN, which could be called Stairnet, consists of an input layer, blocks of stair1 in early layers, a convolutional block attention module (CBAM), blocks of stair2 in mid-layers, another CBAM, blocks of stair3 in late layers, and a fully connected layer;
there are more layers in stair2, compared with stair1 and 3 {the basic components in stair1 and stair2 have some variations}; the structural blocks in stair2 have two variations: when the stride is set to 1 … when the stride is set to 2 {according to the adjustment of some hyperparameters}
stair1 has two variations as follows: when the expansion factor is 1, [the input feature maps] go through an inverted residual structure with convolutions; when the expansion factor is not 1, [the input feature maps] go through a convolutional operation;
the shallow layers of the model can be referred to as stair1 and are constructed using inverted residual blocks that exclusively consist of convolutions (Convs); {all situations with any value of expansion factor are covered, including when the expansion factor is 1, processing includes inverted residual blocks with convolutions, and when the expansion factor is not 1, the processing is still the inverted residual blocks consisting of convolutions, thus a convolutional operation}
stair2 has two variations as follows: when the kernel stride is 1, the channels of the input feature maps are split into two equal parts using the split operation, one part goes through an inverted residual structure with depth-wise separable convolutions, while the other part remains unchanged, after that, the two sets of channels are concatenated and then subjected to the shuffle operation;
when the kernel stride is 2, the channels of the input feature maps are replicated into three copies, one copy goes through an inverted residual structure, another copy goes through a depth-wise separable convolution followed by dimension reduction, the last copy goes through a max pooling operation followed by dimension reduction, finally, the three sets of dimension-reduced channels are concatenated and then subjected to the shuffle operation;
the structural blocks in stair2 have two variations: when the stride is set to 1, the stair2 structure involves performing a split operation on the input channel; one part of the channel passes through an inverted residual structure that includes a depthwise separable convolution (DConvs), while the other part does not undergo any operation; a shuffle operation is performed on the two channels that are concatenated;
when the stride is set to 2, the stair2 structure involves copying the input channel; one part of the channel is reduced in dimension through an inverted residual structure with a depthwise separable convolution, another part is reduced in dimension through the depthwise separable convolution, and the third part is reduced in dimension through maximum pooling; a shuffle operation is performed on the three channels that are reduced in dimension after performing a concatenate operation;
stair3 consists of a residual structure consisting of depth separable convolution and efficient channel attention (ECA);
the deep layer of the model can be referred to as stair3, including inverted residual structures containing depthwise separable convolutions and efficient channel attention (ECA) modules.
[the feature maps obtained from the backbone are inputted to a region proposal network (RPN) to generate proposals, the proposals are projected onto the feature maps outputted by backbone to obtain corresponding feature matrices,]
[the feature matrix is passed through a region of interest (ROI) head to output predicted bounding boxes of the road cracks in the feature maps, the predicted bounding boxes of the road cracks in the feature maps are mapped back to the original image using post-processing to obtain the positions and types of road cracks in the original image.]
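As an aid to the record, the stride-1 stair2 processing mapped above (split the input channels, pass one part through the inverted residual branch, concatenate, then shuffle) can be sketched as follows. This is the examiner's illustrative sketch in plain Python, with channels represented abstractly as list elements and an identity stand-in for the inverted residual branch with depth-wise separable convolutions; it is not the applicant's disclosed implementation.

```python
def channel_split(channels):
    # Split the channels of the input feature map into two equal parts,
    # per the stride-1 stair2 variant.
    c = len(channels) // 2
    return channels[:c], channels[c:]

def channel_shuffle(channels, groups):
    # ShuffleNetV2-style shuffle: regroup the channel list so that channels
    # from the concatenated groups are interleaved.
    per = len(channels) // groups
    return [channels[g * per + i] for i in range(per) for g in range(groups)]

def stair2_stride1(channels, branch):
    # One part goes through the branch (standing in for the inverted residual
    # structure with depth-wise separable convolutions), the other part
    # remains unchanged; the two parts are concatenated and then shuffled.
    a, b = channel_split(channels)
    return channel_shuffle([branch(ch) for ch in a] + b, groups=2)
```

With an eight-channel input and an identity branch, the output channel order is the interleaving of the two halves, [0, 4, 1, 5, 2, 6, 3, 7].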
Claim 1 of the ‘205 application does not teach or recite, but Guo teaches the original images of road surface (Guo page 8, dataset used is the open source dataset RDD2020 consisting of road images from different countries), the images from the dataset are inputted into the backbone (backbone-Stair) to obtain feature maps (Guo pages 3–4 and 8, dataset is used to input images into the model, including a backbone comprised of the MobileNetV3 for feature extraction (to obtain feature maps)), the backbone is depth-stage dependent (Guo page 3, backbone including depthwise convolution adopting different convolution kernels for each input channel (depth-stage dependent)), the input feature maps (Guo page 4, inverted resblock first expands to perform feature extraction (feature maps)), the feature maps obtained from the backbone are inputted to a region proposal network (RPN) to generate proposals (Guo pages 6–7, auto anchor searching, the model learns the location and size of objects and using an a priori box mechanism (region proposal network), an a priori bounding box is obtained and matched to the corresponding feature map), of the road cracks in the feature maps (Guo page 10, Kmeans clustering algorithm used to select anchor boxes suitable for pavement crack detection), the predicted bounding boxes of the road cracks in the feature maps are mapped back to the original image using post-processing to obtain the positions and types of road cracks in the original image (Guo pages 10–11, fig. 7 detection effect of the algorithm shown in fig. 7, which illustrates bounding boxes around different types of road cracks that are labeled such as D00 0.4, and D20 0.6 in the original image).
Further, claim 1 of the ‘205 application does not teach or recite, but Christian teaches the proposals are projected onto the feature maps outputted by backbone to obtain corresponding feature matrices, the feature matrix is passed through a region of interest (ROI) head to output predicted bounding boxes (Christian page 107, and page 110, fig. 5, Proposals Generator and Fast R-CNN section teaching using Fast R-CNN to take object proposals for a frame (image) and project the proposals onto the feature maps to get mapped regions of interest (corresponding feature matrices) then an ROI pooling layer (ROI head) is applied and a fully connected network is used to generate the predicted bounding boxes).
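The proposal-projection step relied upon from Christian can be sketched as follows. This is the examiner's illustration only; the function name and the floor rounding convention are the examiner's assumptions, not taken from Christian.

```python
def project_proposal(box, stride):
    # Map an (x1, y1, x2, y2) proposal from image coordinates onto the
    # backbone feature map by dividing each coordinate by the backbone's
    # cumulative stride (the spatial scale), as in Fast R-CNN ROI
    # projection. Floor rounding is an assumed convention.
    return tuple(int(v // stride) for v in box)
```

For a backbone with cumulative stride 16, an image-space proposal (32, 64, 96, 128) maps to feature-map region (2, 4, 6, 8), over which ROI pooling is then applied.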
Therefore, taking the teachings and recitations of claim 1 of the ‘205 application and Guo together as a whole, it would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to have modified the image processing claimed in claim 1 of the ‘205 application with the bounding box processing and specific road images of Guo at least because doing so would improve detection accuracy. See Guo Abstract.
Further, taking the teachings and recitations of claim 1 of the ‘205 application and Christian together as a whole, it would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to have modified the image processing claimed in claim 1 of the ‘205 application with the object proposal projection of Christian at least because doing so would reduce computational complexity. See Christian page 107, left column.
Regarding claim 2 of the present application, claim 2 corresponds with claim 4 of the ‘205 application.
Regarding claim 3 of the present application, claim 3 corresponds with claim 5 of the ‘205 application.
Regarding claim 4 of the present application, claim 4 corresponds with claim 6 of the ‘205 application.
Regarding claim 5 of the present application, claim 5 corresponds with claim 7 of the ‘205 application.
Regarding claim 6 of the present application, claim 6 corresponds with claim 8 of the ‘205 application.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure:
Woo et al., “CBAM: Convolutional Block Attention Module,” arXiv:1807.06521v2 [cs.CV], https://doi.org/10.48550/arXiv.1807.06521, July 18, 2018, directed towards a convolutional block attention module for providing attention for feed-forward convolutional neural networks.
Li et al., “Alpha-SGANet: A multi-attention-scale feature pyramid network combined with lightweight network based on Alpha-IoU loss,” PLoS ONE 17(10), e0276581, https://doi.org/10.1371/journal.pone.0276581, October 27, 2022, directed towards a multi-attention-scale feature pyramid network combined with a lightweight network, including a prediction head to detect different-scale objects, a feature extraction network using ShuffleNetV2 in the backbone, a Ghost module to construct feature maps to help prediction, and a CBAM module to find areas of interest in the scene.
Ma et al., “ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design,” Proceedings of the European Conference on Computer Vision (ECCV), 2018, pp. 116-131, directed towards a CNN design that includes building blocks with parallel processing paths, different for different stride values, including depth-wise convolution.
Hu et al., WO 2023159558 A1, directed towards a feature extraction network and region proposal network to track an object with a bounding box in an image.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MICHELLE M KOETH whose telephone number is (571)272-5908. The examiner can normally be reached Monday-Thursday, 09:00-17:00, Friday 09:00-13:00, EDT/EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Vincent Rudolph can be reached at 571-272-8243. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
MICHELLE M. KOETH
Primary Examiner
Art Unit 2671
/MICHELLE M KOETH/Primary Examiner, Art Unit 2671