DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-4, 6-11, 13-15 and 17-20 are rejected under 35 U.S.C. 103 as being unpatentable over Edgar et al. (U.S. 2021/0034920 A1) in view of Fathi et al. (U.S. 2020/0082168 A1).
Regarding Claim 1, Edgar discloses one or more non-transitory computer-readable media (Edgar, [0107] “a computer readable storage medium (or media) having computer readable program instructions thereon for causing a processor to carry out aspects”) storing computer-executable instructions that, when executed by at least one processor, perform a method of automated classification and labeling of objects in a plurality of images (Edgar, [0039] “optimizing one or more machine learning models (e.g., machine learning model 110, M1) using accurately annotated/labeled training data samples” and [0040] “the machine learning model M1 can include a DNN image analysis model configured to automatically generate an inference classification based on the medical images”; Edgar teaches a method (a machine learning model) that performs automated classification and labeling of objects (data samples) in a plurality of images), the method comprising:
obtaining the plurality of images (Edgar, [0047] “the collection component 202 can collect or receive hundreds to thousands to millions (or more) of unannotated medical images”; Edgar teaches obtaining the plurality of images),
wherein the plurality of images depicts a region of interest and comprises satellite imagery (Edgar, [0050] “the annotation pipeline module 112 can learn that the model consistently generates low confidence diagnosis for medical images from a specific geographic region” and [0121] “A user enters information into the computer 1502 through input device(s) 1528. Input devices 1528 include satellite dish”; Edgar teaches that the images depict a region of interest (a specific geographic region) and include satellite imagery (images entered via a satellite dish));
reducing the plurality of images to a first subset and a second subset based on similarities between each image of the first subset (Edgar, [0026] “using numerical, optimization techniques that reduce the error between the desired class label and the algorithm's prediction”; [0034] “apply the machine learning model to the unannotated data sample to generate an inference result and compare this inference result with the applied annotation to facilitate determining the accuracy of the annotation”; [0057] “the machine learning model (M1) based on application to the new unannotated data samples included in the annotation queue with same or similar attributes”; and [0096] “At 906, the system can select, a first annotation technique (e.g., a manual annotation technique) for annotating a first subset of the unannotated data samples based on association of the first subset with a first priority level (e.g., a high priority level relative to a defined threshold) of the priority levels … At 908, the system can further select, a second annotation technique (e.g., an automated annotation technique) for annotating a second subset of the unannotated data samples based on association of the second subset with a second priority level (e.g., a low priority level relative to a defined threshold) of the priority levels”; Edgar teaches a first subset of the unannotated data samples associated with a high priority level and a second subset associated with a low priority level, where unannotated data samples with similar attributes are compared to determine association with the high or low priority level, reducing the error between the desired class label and the prediction in the first subset),
wherein the second subset comprises images of the plurality of images not included in the first subset (Edgar, [0096], cited in the analysis of the preceding limitation; Edgar teaches that the second subset images are not included in the first subset because the second subset of the unannotated data samples is associated with the low priority level while the first subset is associated with the high priority level);
forward projecting the annotations to each image of the first subset to obtain annotated images (Edgar, [0012] “Fig. 4, example subsets of annotated data samples generated by the annotation component in association with application to medical images”; Edgar teaches forward projecting the annotations to each image of the first subset (the example subsets of annotated data samples generated for the medical images) to obtain annotated images (Fig. 4));
training a neural network by the first subset including the annotated images (Edgar, [0005] “select the first annotation technique for a first subset of the unannotated data samples based on association of the first subset with a first annotation priority level”; [0039] “the model development module 108 can facilitate training and/or optimizing one or more machine learning models (e.g., machine learning model 110, M1) using accurately annotated/labeled training data samples”; and [0044] “the machine learning model M1 can be a neural network model”; Edgar teaches training a machine learning model/neural network model with the first subset using accurately annotated training data samples); and
classifying and labeling the objects in the second subset by the neural network (Edgar, [0026] “The class labels are then used by the learning algorithm to adapt and change its' internal, mathematical representation (such as the behavior of artificial neural networks) of the data”; [0039] “optimizing one or more machine learning models (e.g., machine learning model 110, M1) using accurately annotated/labeled training data samples”; and [0097] “select, a second annotation technique (e.g., an automated annotation technique) for annotating a second subset of the unannotated data samples”; Edgar teaches classifying and labeling the objects (data samples) in the second subset of the unannotated data samples by the learning algorithm of an artificial neural network).
Edgar discloses that the annotation management component 204 can facilitate rendering the prioritization information at a device associated with a user responsible for managing and/or controlling annotation of the unannotated data samples ([0052]).
However, Edgar does not explicitly teach receiving annotations of the objects in a Digital Surface Model (DSM) or orthorectified image created from the first subset;
Fathi teaches receiving annotations of the objects in a Digital Surface Model (DSM) or orthorectified image created from the first subset (Fathi, [0193] “For semantics, at least the following functionality can be pertinent: free-form 2D and 3D annotation and relational notes to individual or group of objects”; [0195] “tool optimization points to support 3D annotation, machine learning”; and [0239] “a 3D rendering of the asset in the form of digital surface model”; Fathi teaches receiving 3D annotations of individual objects or groups of objects in the form of a digital surface model via a 3D rendering).
Edgar and Fathi are combinable because they are from the same field of endeavor, systems and methods for image processing, and try to solve similar problems. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the method of Edgar to include receiving annotations of the objects in a Digital Surface Model (DSM), as taught by Fathi, because Fathi provides receiving 3D annotations of individual objects or groups of objects in the form of a digital surface model via a 3D rendering (Fathi, [0193], [0195], [0239]). Doing so may provide the image that has the highest SSD for a specific object or surface, or the image that has the minimum occlusion for a specific object or surface (Fathi, [0240]).
Regarding Claim 2, the media of claim 1, Edgar does not explicitly teach wherein the method further comprises:
determining the first subset by comparing viewing parameters between each image of the plurality of images; and selecting one or more pairs of the images at the first subset comprising compatible viewing parameters.
However, Fathi teaches determining the first subset by comparing viewing parameters between each image of the plurality of images, and selecting one or more pairs of the images at the first subset comprising compatible viewing parameters (Fathi, [0126] “Image processing techniques as discussed elsewhere herein that suitably compare two or a plurality of images together and highlights the differences and/or track the severity and size change over time can be used” and [0076] “This linkage structure can be interpreted as a generalization scheme and will be based on linking the features from each pair of views. Viewpoint parameters are also represented by T and S”; Fathi teaches determining the subset by comparing two images and highlighting the differences in severity and size change over time (viewing parameters), and selecting the features from each pair of views, including viewpoint parameters (T, S)).
Edgar and Fathi are combinable; see the rationale set forth in the rejection of claim 1.
Regarding Claim 3, the media of claim 2, Edgar does not explicitly teach wherein the method further comprises determining a digital surface model from the one or more pairs of the images;
However, Fathi teaches determining a digital surface model from the one or more pairs of the images (Fathi, [0126] “Image processing techniques as discussed elsewhere herein that suitably compare two images together” and [0239] “the capture plan on a 3D rendering of the asset in the form of an orthographic aerial imagery, digital surface model”; Fathi teaches determining a digital surface model of the asset in the form of aerial imagery by applying image processing to two images together (a pair of the images)).
Edgar and Fathi are combinable; see the rationale set forth in the rejection of claim 1.
Regarding Claim 4, the media of claim 3, Edgar does not explicitly teach wherein the method further comprises orthorectifying and aggregating an image.
However, Fathi teaches the method further comprises orthorectifying and aggregating an image (Fathi, [0003] “should be directed toward a goal of obtaining improved accuracy (e.g., accurate measurements, enhanced detail, etc.) and thus more complete data collection during an imaging process” and [0074] “the sensor types and associated sensor data characteristics can be selected to allow collection of physical assets of interest to be generated. The selected sensor types and data attainable can include manned or unmanned aerial orthographic imagery”; Fathi teaches that the method further includes aggregating (collecting data) during an imaging process and orthorectifying (orthographic imagery)).
Edgar and Fathi are combinable; see the rationale set forth in the rejection of claim 1.
Regarding Claim 6, the media of claim 1, Edgar does not explicitly teach wherein the objects are buildings, wherein the annotations are indicative of shapes of the buildings.
However, Fathi teaches the objects are buildings, wherein the annotations are indicative of shapes of the buildings (Fathi, [0030] “a “physical asset of interest” includes buildings, all or parts (e.g., internal or external) of a building”; [0005] “an assessment of basic information about a physical asset such as the general size, shape, materials used etc.”; and [0121] “labeling and classification can be relevant elements in inspection for assessment of a physical asset(s) of interest”; Fathi teaches that the objects (physical assets of interest) are buildings and that the labeling/classification is indicative of the shapes of the buildings).
Edgar and Fathi are combinable; see the rationale set forth in the rejection of claim 1.
Regarding Claim 7, Edgar discloses the media of claim 6, wherein the method further comprises:
classifying and labeling the second subset by: inputting the images of the second subset into the neural network (Edgar, [0026] “The class labels are then used by the learning algorithm to adapt and change its' internal, mathematical representation (such as the behavior of artificial neural networks) of the data” and [0066] “The annotation application can further generate and apply an annotation or label to the medical image based on the user input”; Edgar teaches applying the class labels to the second subset of the medical images, based on the user input, with an artificial neural network).
However, Edgar does not explicitly teach training the neural network for building segmentation based on the first subset; and
determining a probability of each pixel containing a building based on the neural network.
Fathi teaches training the neural network for building segmentation based on the first subset (Fathi, [0030] “a “physical asset of interest” includes buildings, all or parts (e.g., internal or external) of a building” and [0078] “Machine learning-based object identification, segmentation, and/or labeling algorithms can be used to identify a collection physical asset of interest. A directed graph can be used to build the neural networks. Each unit can be represented by a node labeled according to its output”; Fathi teaches training the neural network (machine learning-based) for building segmentation (segmentation of the physical asset objects) based on the first subset); and
determining a probability of each pixel containing a building based on the neural network (Fathi, [0030] “a “physical asset of interest” includes buildings”; [0061] “the one or more physical assets of interest can be generated by using probabilistic algorithms”; and [0078] “Deep Convolutional Neural Networks (DCNNs) can be used to assign a label to one or more portions of an image (e.g., a set of pixels creating a regular or irregular shape) that include a given physical asset of interest”; Fathi teaches determining a probability of each pixel (using probabilistic algorithms) containing a building (a physical asset) based on the neural network).
Edgar and Fathi are combinable; see the rationale set forth in the rejection of claim 1.
Regarding Claim 8, the media of claim 1, Edgar does not explicitly teach wherein the method further comprises: receiving traffic infrastructure annotation indicative of traffic infrastructure; training the neural network based on the traffic infrastructure; and determining the traffic infrastructure in the plurality of images.
However, Fathi teaches receiving traffic infrastructure annotation indicative of traffic infrastructure (Fathi, [0030] “a “physical asset of interest” includes transportation infrastructure (e.g., roads, bridges, ground stockpiles)” and [0013] “bridges, roads, etc. can also require the acquisition of 2D and/or 3D data derived from sensors when conducting inspection, repair…”; Fathi teaches receiving traffic infrastructure (transportation infrastructure) annotation indicative of traffic infrastructure such as roads, bridges, etc., when conducting inspection and repair);
training the neural network based on the traffic infrastructure (Fathi, [0030] “a “physical asset of interest” includes transportation infrastructure (e.g., roads, bridges, ground stockpiles)” and [0078] “Deep Convolutional Neural Networks (DCNNs) can be used to assign a label to one or more portions of an image that include a given physical asset of interest”; Fathi teaches training the neural network based on the traffic infrastructure, e.g., roads and bridges); and
determining the traffic infrastructure in the plurality of images (Fathi, [0013] “With a human operator managing the image acquisition…bridges, roads, etc. can also require the acquisition of 2D and/or 3D data derived from sensors when conducting inspection, repair…”; Fathi teaches determining the traffic infrastructure (bridges, roads) in the acquired images).
Edgar and Fathi are combinable; see the rationale set forth in the rejection of claim 1.
Regarding Claim 9, the media of claim 1, Edgar does not explicitly teach wherein the plurality of images comprises one of satellite data, radar data, LiDAR data, scanning laser mapping data, or stereo photogrammetry data.
However, Fathi teaches the plurality of images comprises one of satellite data or LiDAR data (Fathi, [0074] “The selected sensor types and data attainable therefrom can include satellite imagery, airborne LiDAR, terrestrial LiDAR”; Fathi teaches that the plurality of images includes satellite data and LiDAR data).
Edgar and Fathi are combinable; see the rationale set forth in the rejection of claim 1.
Regarding Claim 10, a combination of Edgar and Fathi discloses a system (Edgar, [0003] “a system”) for automated classification and labeling of objects in a plurality of images (Edgar, [0040] “the machine learning model M1 can include a DNN image analysis model configured to automatically generate an inference classification based on the medical images”), the system comprising:
at least one processor (Edgar, [0003] “a processor”);
a datastore (Edgar, [0003] “a memory that stores computer executable components”); and
one or more non-transitory computer-readable media storing computer-executable instructions that, when executed by the at least one processor, perform a method comprising:
obtaining the plurality of images,
wherein the plurality of images depicts a region of interest and comprises satellite imagery;
reducing the plurality of images to a first subset and a second subset based on similarities between each image of the first subset,
wherein the second subset comprises images of the plurality of images not included in the first subset;
receiving annotations of the objects in a Digital Surface Model (DSM) or orthorectified image created from the first subset;
forward projecting the annotations to each image of the first subset to obtain annotated images;
training a neural network by the first subset including the annotation of the objects; and classifying and labeling the objects in the second subset by the neural network.
Claim 10 is substantially similar to claim 1 and is rejected based on a similar analysis.
Regarding Claim 11, a combination of Edgar and Fathi discloses the system of claim 10, wherein the method further comprises:
determining the first subset by comparing viewing parameters between each image of the plurality of images;
selecting one or more pairs of the images at the first subset comprising compatible viewing parameters;
determining a digital surface model from the one or more pairs of the images; and
orthorectifying and aggregating the images.
Claim 11 is substantially similar to claims 2, 3, and 4 and is rejected based on similar analyses.
Regarding Claim 13, a combination of Edgar and Fathi discloses the system of claim 10, wherein the objects are buildings,
wherein the annotation is indicative of shapes of the buildings.
Claim 13 is substantially similar to claim 6 and is rejected based on a similar analysis.
Regarding Claim 14, a combination of Edgar and Fathi discloses the system of claim 13, wherein the method further comprises training the neural network for building segmentation based on the first subset and classifying and labeling the second subset by: inputting the images of the second subset into the neural network; and determining a probability of each pixel containing a building based on the neural network.
Claim 14 is substantially similar to claim 7 and is rejected based on a similar analysis.
Regarding Claim 15, a combination of Edgar and Fathi discloses a method (Edgar, [0029] “computer-implemented methods”) of automated classification and labeling of objects in a plurality of images (Edgar, [0040] “the machine learning model M1 can include a DNN image analysis model configured to automatically generate an inference classification based on the medical images”), the method comprising:
obtaining the plurality of images, wherein the plurality of images depicts a region of interest and comprises satellite imagery;
reducing the plurality of images to a first subset and a second subset based on similarities between each image of the first subset, wherein the second subset comprises images of the plurality of images not included in the first subset;
receiving annotation of the objects in a Digital Surface Model (DSM) or orthorectified image created from the first subset;
forward projecting the annotations to each image of the first subset to obtain annotated images;
training a neural network by the first subset including the annotation of the objects; and
classifying and labeling the objects in the second subset by the neural network.
Claim 15 is substantially similar to claim 1 and is rejected based on a similar analysis.
Regarding Claim 17, a combination of Edgar and Fathi discloses the method of claim 15, wherein the objects are buildings, wherein the annotation is indicative of shapes of the buildings.
Claim 17 is substantially similar to claim 6 and is rejected based on a similar analysis.
Regarding Claim 18, a combination of Edgar and Fathi discloses the method of claim 17, further comprising training the neural network for building segmentation based on the first subset and classifying and labeling the second subset by:
inputting the images of the second subset into the neural network; and determining a probability of each pixel containing the buildings based on the neural network.
Claim 18 is substantially similar to claim 7 and is rejected based on a similar analysis.
Regarding Claim 19, a combination of Edgar and Fathi discloses the method of claim 15, further comprising: receiving traffic infrastructure annotation indicative of traffic infrastructure; training the neural network based on the traffic infrastructure; and determining the traffic infrastructure in the plurality of images.
Claim 19 is substantially similar to claim 8 and is rejected based on a similar analysis.
Regarding Claim 20, a combination of Edgar and Fathi discloses the method of claim 15, wherein the plurality of images comprises one of satellite data, radar data, LiDAR data, scanning laser mapping data, or stereo photogrammetry data.
Claim 20 is substantially similar to claim 9 and is rejected based on a similar analysis.
Claims 5, 12 and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Edgar et al. (U.S. 2021/0034920 A1) in view of Fathi et al. (U.S. 2020/0082168 A1), and further in view of Wu et al. (U.S. 2019/0019324 A1).
Regarding Claim 5, the media of claim 3, a combination of Edgar and Fathi does not explicitly teach wherein the method further comprises:
detecting occluded areas and shadows in the digital surface model; and
generating an orthorectified image with mask layers indicating the occluded areas and areas in shadow.
However, Wu teaches detecting occluded areas and shadows in the digital surface model (Wu, [0029] “raw texture images (e.g., from aerial or satellite imager) can be used as textures, raw texture images captured by airborne or terrestrial cameras may suffer from heavy shadow, occlusion, image noise/distortion”; [0033] “a “3D Building Model” can refer to a “Digital Surface Model” (DSM), a single surface that includes buildings, ground, etc.”; and [0036] “for 3D building models, textures can be photorealistic (projected from aerial) or abstract (simplified representations of the raw textures). Typically, textures of a DSM are generally photorealistic”; Wu teaches detecting occluded areas (occlusion) and shadows in the DSM); and
generating an orthorectified image with mask layers indicating the occluded areas and areas in shadow (Wu, [0038] “FIG. 2, in a photorealistic texture 201 of a building wall or facade, some of the wall texture 201 can be obscured by trees, cars or other buildings and colors may be distorted by shadows and reflections 203” and Fig. 4, [0045] “the system 100 can apply a morphological “open” and “close” operations to the image mask 411 generated”; Wu teaches generating an orthorectified image with mask layers, e.g., Figs. 2 and 4 are orthorectified images with mask layers indicating the occluded areas (black areas) and areas in shadow (white areas)).
Edgar, Fathi and Wu are combinable because they are from the same field of endeavor, systems and methods for image processing, and try to solve similar problems. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the method of Edgar to include detecting occluded areas (occlusion) and shadows in the DSM, as taught by Wu, because Wu provides detecting occluded areas and shadows in the DSM (Wu, [0033], [0036], [0039]). Doing so may provide for efficiently creating textures that realistically represent real-world objects such as buildings (Wu, [0001]).
Regarding Claim 12, a combination of Edgar, Fathi and Wu discloses the system of claim 11, wherein the method further comprises:
detecting occluded areas and shadows in the digital surface model; and
generating an image with mask layers indicating the occluded areas and the shadows.
Claim 12 is substantially similar to claim 5 and is rejected based on a similar analysis.
Regarding Claim 16, a combination of Edgar, Fathi and Wu discloses the method of claim 15, further comprising:
detecting occluded areas and shadows; and
generating an image with a mask layer indicating occluded areas and shadows.
Claim 16 is substantially similar to claim 5 and is rejected based on a similar analysis.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure: Fuchs et al. (U.S. 2019/0286936 A1) and Hamid et al. (U.S. 2016/0026848 A1).
Any inquiry concerning this communication or earlier communications from the examiner should be directed to KHOA VU whose telephone number is (571) 272-5994. The examiner can normally be reached 8:00-4:00.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kee Tung can be reached at 571-272-7794. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/KHOA VU/Examiner, Art Unit 2611
/KEE M TUNG/Supervisory Patent Examiner, Art Unit 2611