DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Status of Claims
This Office Action is in response to the application filed on 11/21/2025. Claims 11-16 were previously canceled. Claim 1 has been amended. Claim 20 is newly added. Claims 1-10 and 17-20 are presently pending and are presented for examination.
Response to Amendments
The amendment filed on 11/21/2025 has been entered.
Response to Arguments
Applicant's arguments filed 11/21/2025 have been fully considered but they are not persuasive.
In response to applicant's argument that the references fail to show certain features of applicant's invention, it is noted that the features upon which applicant relies (i.e., "the enriched tensor data structure and the enriched tensor data structure comprises storage for color image data, point cloud data, depth data, and inertial data; and further elaborated color image data in the form of RGB images, depth data in the form of depth frames, and inertial data in the form of IMU values") are not recited in the rejected claim(s). Although the claims are interpreted in light of the specification, limitations from the specification are not read into the claims. See In re Van Geuns, 988 F.2d 1181, 26 USPQ2d 1057 (Fed. Cir. 1993).
Applicant’s arguments rely on language solely recited in preamble recitations in claim(s) 6. When reading the preamble in the context of the entire claim, the recitation “wherein the enriched tensor data structure additionally comprises a plurality of time-divided frames, each frame comprising” is not limiting because the body of the claim describes a complete invention and the language recited solely in the preamble does not provide any distinct definition of any of the claimed invention’s limitations. Thus, the preamble of the claim(s) is not considered a limitation and is of no significance to claim construction. See Pitney Bowes, Inc. v. Hewlett-Packard Co., 182 F.3d 1298, 1305, 51 USPQ2d 1161, 1165 (Fed. Cir. 1999). See MPEP § 2111.02.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
Determining the scope and contents of the prior art.
Ascertaining the differences between the prior art and the claims at issue.
Resolving the level of ordinary skill in the pertinent art.
Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 1-2, 6-9, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over U.S. Pub. No. 20210262204 (hereinafter, "Tafazoli Bilandi"; previously of record) in view of U.S. Pat. No. 11268881 (hereinafter, "Finn"; previously of record).
Regarding claim 1, Tafazoli Bilandi discloses an AI-based monitoring system for use with detecting the condition of a shovel during mineral loading in a mining operation, the system comprising:
one or more sensors (Fig. 2B, #128);
further comprising storage (Fig. 2B, #202) for color image data (“The I/O 204 further includes an image signal input port 230 for receiving image signals from the image sensor 128. In one embodiment the image sensor 128 may be a digital camera and the image signal port 230 may be an IEEE 1394 (firewire) port, USB port…” (para 0072)), point cloud data (“In some embodiments, the image sensor 128 may be implemented as a stereoscopic camera capable of using stereo-imaging to obtain a 3D point cloud of the scene” (para 0074)), depth data (“The three-dimensional coordinates have the advantage of providing sizing and depth information associated with regions of interest” (para 0074)), and inertial data (“In some embodiments, motion sensors such as an inertial sensor may be placed on a moving arm (for example the crowd 122 of the mining shovel 104 shown in FIG. 1A) to provide signals that may be used to determine a current operational state of the mining shovel” (para 0113));
a weighing mechanism (“In general the network 600 is initially configured to set weights w.sub.i to some initial value… A minimization algorithm, such as a batch stochastic gradient descent minimization, is then applied to determine new values for the weights w.sub.i. This step generally involves determining the gradient of the cost function using a backpropagation algorithm to adjust the weights w.sub.i until the generated probability map” (para 0103));
one or more outputs (Fig. 6, #634);
an artificial intelligence module, further comprising a plurality of neural networks adapted to detect a plurality of objects of interest (Fig. 6, #600); and
one or more foundational models adapted to detect objects of interest ((Fig. 8, #800) and “The neural network 600 further includes an output layer 634 that includes a plurality of neurons that produces probabilities p.sub.j that the image pixel 610 in the patch 612 corresponds to specific region of interest” (para 0102)).
However, Tafazoli Bilandi does not explicitly teach
an enriched tensor data structure,
Finn, in the same field of endeavor, teaches
an enriched tensor data structure (“organizing, by the processor, the sensor data into a tensor” (Col. 2, lines 14-20)),
One of ordinary skill in the art, before the time of filing, would have been motivated to modify the disclosure of Tafazoli Bilandi with the teachings of Finn in order to separate, by the processor, the tensor into at least one of a normal part and an abnormal part; see Finn at least at (Col. 2, lines 14-20).
Regarding claim 2, Tafazoli Bilandi discloses the system of claim 1. Additionally, Tafazoli Bilandi discloses wherein the sensors additionally comprise:
one or more color image cameras (Fig. 2B, #128);
one or more stereoscopic cameras (“In some embodiments, the image sensor 128 may be implemented as a stereoscopic camera capable of using stereo-imaging to obtain a 3D point cloud of the scene” (para 0074)); and
one or more inertial measurement units (“In some embodiments, motion sensors such as an inertial sensor may be placed on a moving arm (for example the crowd 122 of the mining shovel 104 shown in FIG. 1A)” (para 0113)).
However, Tafazoli Bilandi does not explicitly teach
one or more LIDAR sensors;
Finn, in the same field of endeavor, teaches
one or more LIDAR sensors (“Various depth sensing sensor technologies and devices include, but are not limited to, a structured light measurement, phase shift measurement, time of flight measurement, stereo triangulation device, sheet of light triangulation device, light field cameras, coded aperture cameras, computational imaging techniques, simultaneous localization and mapping (SLAM), imaging radar, imaging sonar, echolocation, laser radar, scanning light detection and ranging (LIDAR), flash LIDAR” (Col. 4, lines 42-50));
One of ordinary skill in the art, before the time of filing, would have been motivated to modify the disclosure of Tafazoli Bilandi with the teachings of Finn in order to provide a sensor that can be a one-dimensional (1D), 2D, or 3D camera or camera system; a 1D, 2D, or 3D depth sensor or depth sensor system; and/or a combination and/or array thereof; see Finn at least at (Col. 4, lines 31-35).
Regarding claim 6, Tafazoli Bilandi discloses the system of claim 2. However, Tafazoli Bilandi does not explicitly teach wherein the enriched tensor data structure additionally comprises a plurality of time-divided frames, each frame comprising: color image data in the form of RGB images, captured by color image cameras; point-cloud data captured by LIDAR sensors; depth data in the form of depth frames, captured by stereoscopic cameras; and inertial data in the form of IMU values, captured by inertial measurement units.
Finn, in the same field of endeavor, teaches
wherein the enriched tensor data structure additionally comprises a plurality of time-divided frames, each frame comprising: color image data in the form of RGB images, captured by color image cameras; point-cloud data captured by LIDAR sensors; depth data in the form of depth frames, captured by stereoscopic cameras; and inertial data in the form of IMU values, captured by inertial measurement units (“wherein the 3D (depth) data from sensor(s) 12 comprises a frame of depth information arranged as a 3-dimensional (x,y,z) depth tensor, for example as an occupancy grid, a tensor-based extension of the matrix-based RPCA process may be used. In this case, the sensor frames may be arranged as successive 3-dimensional sub-arrays of a 4-dimensional tensor” (Col. 9, lines 14-40)).
One of ordinary skill in the art, before the time of filing, would have been motivated to modify the disclosure of Tafazoli Bilandi with the teachings of Finn in order to provide a 4-dimensional tensor; see Finn at least at (Col. 9, lines 14-40).
Regarding claim 7, Tafazoli Bilandi discloses a method of detecting the condition of a shovel during mineral loading in a mining operation, the method comprising:
defining one or more regions of interest where an object of interest is to be found (“The template includes information based on a physical extent and shape of an object within the region of interest. For the example of the toothline 108 in FIG. 7, the template information may define a rectangular block” (para 0105));
processing one or more results of the artificial intelligence module with a weighing mechanism (“In general the network 600 is initially configured to set weights w.sub.i to some initial value… A minimization algorithm, such as a batch stochastic gradient descent minimization, is then applied to determine new values for the weights w.sub.i. This step generally involves determining the gradient of the cost function using a backpropagation algorithm to adjust the weights w.sub.i until the generated probability map” (para 0103)); and
generating an alert based on the results of the weighing mechanism (“...For example, a single deviation d.sub.i may be initially ignored, but several consecutive deviations in d.sub.i increase the level of confidence in making the determination at block 1116 that an alert should be initiated” (para 0124)).
However, Tafazoli Bilandi does not explicitly teach
building an enriched tensor data structure based on time-divided frames;
processing the enriched tensor with an artificial intelligence module;
Finn, in the same field of endeavor, teaches
building an enriched tensor data structure based on time-divided frames (“organizing, by the processor, the sensor data into a tensor” (Claim 3));
processing the enriched tensor with an artificial intelligence module (“separating, by the processor, said tensor into at least one of a normal part and an abnormal part” (Claim 3));
One of ordinary skill in the art, before the time of filing, would have been motivated to modify the disclosure of Tafazoli Bilandi with the teachings of Finn in order to separate, by the processor, the tensor into at least one of a normal part and an abnormal part; see Finn at least at (Col. 2, lines 14-20).
Regarding claim 8, Tafazoli Bilandi discloses the method of claim 7. Additionally, Tafazoli Bilandi discloses wherein defining one or more regions of interest additionally comprises one or more of:
defining a preset region (“The method may involve training the first neural network to identify regions of interest” (para 0011));
defining a rectangular region within a camera's field of vision (“The template includes information based on a physical extent and shape of an object within the region of interest. For the example of the toothline 108 in FIG. 7, the template information may define a rectangular block” (para 0105));
defining a region bound by two horizontal lines within a camera's field of vision; defining a region bound by two vertical lines within a camera's field of vision (“The labeled training image 706 may have been generated by a human operator examining the image to identify a tooth region of interest, which is then marked by a bounding box 708 and saved along with the labelling information as a labeled training image” (para 0103));
defining a region based on the depth of certain areas within a camera's field of vision (“Processing the plurality of images may involve processing the images to determine three-dimensional depth information associated with pixels in the images and identifying of regions of interest may be based on the three-dimensional depth information” (para 0033));
defining a segment of time (“In other embodiments, a best image over a period of time, e.g. 5 minutes having the clearest landmarks identified is transmitted to the remote processor 136 for further processing” (para 0128));
defining a region based on object detections by determining a center point of the objects and applying a minimum and maximum distance to the center point (“In one embodiment block 1114 directs the microprocessor 200 to calculate a distance d.sub.i between the prior tooth tip location and the current tooth tip location for each tooth where i is the assigned tooth number. The distances d.sub.i are used by the microprocessor 200 to determine how long the current detected tooth tips are compared to the prior tooth tips. Block 1116 then directs the microprocessor 200 to determine whether the distances d.sub.i are less than a missing tooth threshold” (para 0122)); and
defining a region surrounded by a detected object (“Templates for various regions of interest may be saved in the memory 202 or mass storage unit 208 for performing a subsequent template matching process. The template includes information based on a physical extent and shape of an object within the region of interest” (para 0105));
defining a region based on some inertial data (“In one embodiment the motion of each designated region of interest may be estimated by a Kalman filter implemented by the microprocessor 200 that predicts a location of the designated region in successive images. If at block 1010 the predicted location is determined by the microprocessor 200 to be proximate an expected range of positions position within the image” (para 0122) and “motion sensors such as an inertial sensor may be placed on a moving arm (for example the crowd 122 of the mining shovel 104 shown in FIG. 1A) to provide signals that may be used to determine a current operational state of the mining shovel. For example, back and forth movements of the crowd 122 may be indicative of active loading thus increasing the likelihood that tooth damage may occur. Similarly, sensors on the crowd 122 may indicate whether the operating implement 106 and crowd are moving or static” (para 0113)), wherein
the regions and segments defined represent an area where an object of interest is likely to be found (“causing the embedded processor to perform a correlation between the probability map and a template associated with the particular region of interest, the template including information based on a physical extent and shape of an object within the region of interest” (claim 2)).
Regarding claim 9, Tafazoli Bilandi discloses the method of claim 7. Additionally, Tafazoli Bilandi discloses wherein building the enriched tensor data structure additionally comprises:
defining a region of interest based on preset inputs or analyzing the collected raw data; and creating a data structure containing normalized raw data from within the region of interest (“The method may involve training the first neural network to identify regions of interest by (a) performing a first training of a neural network having a first plurality of interconnected neurons using a first plurality of labeled images of critical regions and non-critical regions” (para 0011)).
However, Tafazoli Bilandi does not explicitly teach
collecting raw data from at least an inertial measurement unit, stereoscopic cameras, color image cameras, and LIDAR;
Finn, in the same field of endeavor, teaches
collecting raw data from at least an inertial measurement unit, stereoscopic cameras, color image cameras, and LIDAR ((Fig. 1, #12) “In various embodiments, sensor 12 may include a structured light line sensor…” (Col. 4, lines 40-45));
One of ordinary skill in the art, before the time of filing, would have been motivated to modify the disclosure of Tafazoli Bilandi with the teachings of Finn in order to provide a sensor that can be a one-dimensional (1D), 2D, or 3D camera or camera system; a 1D, 2D, or 3D depth sensor or depth sensor system; and/or a combination and/or array thereof; see Finn at least at (Col. 4, lines 31-35).
Regarding claim 19, Tafazoli Bilandi discloses the system of claim 1. Additionally, Tafazoli Bilandi discloses wherein the system is located proximate to the sensors (Fig. 2B, #200).
Claims 10 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over U.S. Pub. No. 20210262204 (hereinafter, "Tafazoli Bilandi"; previously of record) in view of NPL Document "Data augmentation for improving deep learning in image classification problem" (hereinafter, "Mikołajczyk"; previously of record).
Regarding claim 10, Tafazoli Bilandi discloses a method of training an artificial intelligence module for determining a condition of a shovel during mineral loading in a mining operation, comprising:
wherein the training images depict wear parts in damaged, unattached, or undamaged states, and wherein the data augmentation is adapted to manipulate raw data from visual cameras, LIDAR, environmental sensors, and IMU (“It is generally desirable to have a sufficient number labeled training images under different lighting and other conditions, differing scale, and differing types of operating implements and wear parts” (para 0103));
receiving a set of predicted labels (“evaluating performance of the neural network based on results obtained for processing of a second plurality of labeled images” (para 0011));
applying an intersection of a set of testing labels to the predicted labels (“Determining that the correlation between the probability map and the template meets a criterion may involve selecting successive portions of the probability map based on the template, computing a correlation value for each successive portion, and identifying the portion of the image for further processing based on identifying one of the successive portions of the probability map that has a maximum correlation value that also meets the correlation criterion” (para 0010) and “pruning neurons having a weighting below a pruning threshold to produce a pruned neural network having a second plurality of interconnected neurons, (d) re-evaluating the performance of the pruned neural network having the second plurality of interconnected neurons based on results obtained for processing the second plurality of labeled images,” (para 0011));
generating a set of error values; and adjusting the weights of the neural network by applying the error values (“During training of the neural network 600, the probability map 702 produced by the neural network 600 is compared to the labeled training image 706 and errors are computed for each pixel and fed back to the neural network to make adjustments to the weights w.sub.i. Initially when the weights w.sub.i are set to their initial value” (para 0103)).
However, Tafazoli Bilandi does not explicitly teach
introducing a batch of training images that have been subjected to data augmentation to a neural network,
Mikołajczyk, in the same field of endeavor, teaches
introducing a batch of training images that have been subjected to data augmentation to a neural network (“the current research about so called adversarial attacks on CNNs showed that deep neural networks can be easily fooled into misclassification of images just by partial rotations and image translation [1], adding the noise to images [5] and even changing one, skillfully selected pixel in the image [6]. Increasing the dataset size via data augmentation and image synthesis make it generally more robust and less vulnerable for the adversarial attacks” (page 1, Col. 2)),
One of ordinary skill in the art, before the time of filing, would have been motivated to modify the disclosure of Tafazoli Bilandi with the teachings of Mikołajczyk in order to increase the dataset size via data augmentation; see Mikołajczyk at least at (page 1, Col. 2).
Regarding claim 18, Tafazoli Bilandi discloses the method of claim 10. However, Tafazoli Bilandi does not explicitly teach wherein data augmentation additionally comprises one or more of:
mirroring a training image;
tilting horizontally a training image by up to 15-degrees;
zooming in or out of a training image; and
adjusting the contrast of a training image.
Mikołajczyk, in the same field of endeavor, teaches
wherein data augmentation additionally comprises one or more of:
mirroring a training image; tilting horizontally a training image by up to 15-degrees; zooming in or out of a training image; and adjusting the contrast of a training image ((Fig. 1, # Same image after different types of affine transformations) and (Fig. 2, #Same image after different types of color transformations)).
One of ordinary skill in the art, before the time of filing, would have been motivated to modify the disclosure of Tafazoli Bilandi with the teachings of Mikołajczyk in order to increase the number of samples for training the deep neural models [8], to balance the size of datasets [9], and to improve their efficiency; see Mikołajczyk at least at (page 2, Col. 1).
Allowable Subject Matter
Claims 3-5, 17, and 20 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form, including all of the limitations of the base claim and any intervening claims, once the above objections and/or rejections corresponding to the claims are remedied.
The following is a statement of reasons for the indication of allowable subject matter:
The primary reason for allowance of claims 5 and 20 in the instant application is that the prior art of record fails to teach the overall combination as claimed.
Claim 3 recites "one or more neural networks are configured to process only color images; one or more neural networks are configured to process only point clouds; one or more neural networks are configured to process only depth maps; one or more neural networks (Fig. 6, #600) are configured to process only inertial data; one or more foundational models are configured to holistically process color images, point clouds, depth maps, and inertial data; and the one or more neural networks and foundational models each return a result in the form of a predicted object label and a confidence level".
Claim 5 recites "wherein the foundational models additionally comprise: one or more vision transformers".
Claim 20 recites "one or more vision transformers that process data from the enriched tensor in its entirety".
The prior art of record neither anticipates nor renders obvious the above-recited combination.
As allowable subject matter has been indicated, applicant's reply must either comply with all formal requirements or specifically traverse each requirement not complied with. See 37 CFR 1.111(b) and MPEP § 707.07(a).
Conclusion
THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ADAM ALHARBI whose telephone number is 313-446-6621. The examiner can normally be reached on M-F 10am-6:30pm.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Abby Flynn can be reached on (571) 272-9855. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8406.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/ADAM M ALHARBI/Primary Examiner, Art Unit 3663