DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Information Disclosure Statement
The information disclosure statement(s) (IDS) has/have been considered and placed in the application file.
Election/Restrictions
Claims 1-20 are directed to inventions that are independent or distinct from the invention originally claimed for the following reasons:
Group I. Claims 1-8, drawn to training a system to dynamically identify a surgical tray and items contained thereon, classified in G06N3/08.
Group II. Claims 9-20, drawn to the use of a system for dynamically identifying a surgical tray and items contained thereon, classified in G06V2201/034.
Claims 9-20 are directed to an invention (Group II) that was not elected in parent application 17/693,563 (see the action filed 5/30/2023); the invention of Group I was elected without traverse in the reply filed 5/30/2023.
All claims that the examiner finds are not directed to the elected invention (claims 9-20) are considered withdrawn from further consideration by the examiner in accordance with 37 CFR 1.142(b). Accordingly, claims 9-20 have not been further treated on the merits. See MPEP § 821.01 through § 821.04. Claims 1-8 have been examined below.
Applicant may cancel or withdraw the claim(s).
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 1-8 are rejected under 35 U.S.C. 103 as being unpatentable over Kumar et al. (US 12,285,298 B2, hereinafter "Kumar") in view of Wong et al. ("Synthetic dataset generation for object-to-model deep learning in industrial applications," hereinafter "Wong").
Claim 1.
Kumar discloses a method for training a computer system to dynamically identify a surgical tray and items contained thereon (C6:L55-61: "the image accessor 210 accesses ... a first image ... depicts a set of instruments ... available for use in the procedure ... the first image may be captured by the camera 240 of the device 130 (e.g., by taking a digital photograph of a surgical tray in which a set of surgical instruments has been arranged in preparation for a surgical procedure to be performed by surgeon)"; C2:L7-9: "... facilitate detection, classification, identification, and tracking of instruments ..."), the method comprising:
a. (Kumar teaches the resulting preliminary 3-dimensional model at C9:L13-16: "For rendering synthetic images, the trainer machine may access ... 3D models (e.g., computer-aided design (CAD) models) of different surgical instruments,"; the "scanning" language of this step is addressed by Wong below);
b. revising the preliminary 3-dimensional model of the surgical instrument to create a final 3-dimensional synthetic item by defining at least one element selected from the group consisting of: geometry, position of each vertex, UV position of each texture coordinate vertex, vertex normals, faces that make each polygon defined as a list of vertices, and texture coordinates for the item (C9:L18: "... surface texture information for the surgical instruments,"; Wong also teaches the data structure limitation: "This generates a mesh file in the .obj format, which contains positions of vertices in space, its UV coordinate and polygons in the form of vertex lists, and a texture .jpeg file." (Page 9));
c. assigning a unique identification to the final 3-dimensional synthetic item, wherein the final 3-dimensional synthetic item with the unique identification is ... (C2:L40-50 discloses identifying a specific individual object (e.g., along with detection, classification, or both), such as a particular instrument; C8:L59-61: "Appropriate corresponding data labels (e.g., the bounding box and the segmentation mask) may be automatically generated ..."; the "stored in a database" language is addressed by Wong below);
d. creating at least one hundred unique training synthetic images of the final 3-dimensional synthetic item ("In certain example embodiments, the object identifier ... is trained using a large synthetic dataset ..." (C8:L42-44)), each of the unique training synthetic images differs from the final 3-dimensional synthetic item by randomly varying at least one element of the final 3-dimensional synthetic item ("automatically generate a large and diverse synthetic dataset of images." (C8:L59-60)) selected from the group consisting of: orientation of the final 3-dimensional synthetic item ("The system randomly changes (e.g., via domain randomization) the location, orientation or pose of the 3D objects ..." (C8:L55-57)), synthetic light color or intensity illuminating the final 3-dimensional synthetic item ("... changes ... the lighting ..." (C8:L57) and "... ranges of parameters for defining lighting ... such as the brightness range or the ranges of spectral composition variations." (C9:L19-21)), and elevation of the final 3-dimensional synthetic item above an identified surface ("... ranges of possible camera locations ... (e.g., azimuth, elevation, pan, tilt, etc.) ..." (C9:L22-23)), wherein each of the at least one hundred unique training synthetic images is linked to the unique identification of the final 3-dimensional synthetic item (Kumar teaches that, as the machine generates each synthetic image, it simultaneously generates the "link" to the correct identification (the label/mask): "Appropriate corresponding data labels (e.g., the bounding box and the segmentation mask) may be automatically generated by the trainer machine during the simulation, thereby reducing labeling cost" (C8:L59-62); Wong confirms the scale (100+ images) and parameters: "Using 100K synthetic images for 10 classes ..." (Page 1); "Camera location was defined to be evenly distributed around rings ... defined via an azimuth and elevation." (Page 5)) (an illustrative sketch of this randomization appears after step h below);
e. creating at least one unique test synthetic image of the final 3-dimensional synthetic item, the unique test synthetic image differs from the final 3-dimensional synthetic item by randomly varying at least one element of the final 3-dimensional synthetic item selected from the group consisting of: orientation of the final 3-D synthetic item, synthetic light color or intensity illuminating the final 3-D synthetic item, and elevation of the final 3-D synthetic item above an identified surface ("the trainer machine randomly places 3D objects in the scene ... with arbitrary orientations and positions." (C8:L49-52, C9:L27-30); "The system randomly changes (e.g., via domain randomization) the location, orientation or pose of the 3D objects ..." (C8:L55-57); Kumar implies testing via metrics: "The object identifier may evaluate bounding box metrics using Pascal VOC metrics." (C8:L40-41));
f. processing the training synthetic images with the system to create an identification model for the final 3-dimensional synthetic item ("Step 2: The trainer machine trains the object identifier by using the synthetic surgical instrument dataset generated in Step 1 as the training dataset." (C9:L36-38); "... deep convolutional neural network ... classification training model ..." (C7:L55-65));
g. processing the synthetic test images with the system so the system identifies the 3-dimensional synthetic item from the synthetic test images based on the identification model, and the system provides a numeric confidence factor representing confidence that the system has correctly identified the 3-dimensional synthetic item (" ... obtain an answer, which may be provided a likelihood score indicating a level of confidence in the answer." (C14:L23-24)); and
h. determining if the numeric confidence factor is equal to or greater than a pre-set confidence factor uploaded into the system and repeating steps d through h if the identification is not correct or the numeric confidence factor is less than the pre-set confidence factor (Kumar teaches evaluating metrics and further training. " ... evaluate bounding box metrics ... " (C8:L40) and "After pre-training with this synthetic dataset, the trainer machine modifies (e.g., by further training) the object identifier ... " (C9:L3-5)).
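For illustration only, the following is a minimal sketch, not drawn from Kumar or Wong, of the domain randomization recited in steps d and e above: each training image randomly varies orientation, lighting, and elevation and remains linked to the item's unique identification, while each test image omits that link. All names, value ranges, and the field layout in the sketch are hypothetical.

    import random
    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class SyntheticImageSpec:
        item_id: Optional[str]   # unique identification (None for unlinked test images)
        yaw_deg: float           # orientation of the 3-D synthetic item
        light_intensity: float   # synthetic light intensity illuminating the item
        elevation_mm: float      # elevation of the item above the identified surface

    def randomize(item_id, labeled=True):
        # Randomly vary the elements listed in steps d and e (orientation,
        # lighting, elevation); training images keep the link to the unique
        # identification, test images do not.
        return SyntheticImageSpec(
            item_id=item_id if labeled else None,
            yaw_deg=random.uniform(0.0, 360.0),
            light_intensity=random.uniform(0.5, 1.5),
            elevation_mm=random.uniform(0.0, 50.0),
        )

    # Step d: at least one hundred unique training synthetic images linked to the unique ID.
    training_set = [randomize("instrument-0001", labeled=True) for _ in range(100)]
    # Step e: at least one unique test synthetic image that is not linked to the unique ID.
    test_set = [randomize("instrument-0001", labeled=False)]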
Kumar discloses all of the subject matter as described above except for specifically teaching "a. scanning a surgical instrument with a scanner device at least two times to create a preliminary [3-dimensional model of the surgical instrument]," "stored in a database," "wherein the unique test synthetic image is not linked to the unique identification of the final 3-dimensional synthetic item," and "repeatedly." However, Wong, in the same field of endeavor, teaches "a. scanning" (p. 3: "3D Modelling: This stage involves scanning physical products to produce 3D models."; p. 4: "Minimal human effort is required – typically no more than 40 images were required per model."; p. 8: "Images were captured at two levels ... roughly 30° and 60° elevation.") and "stored in a database" (Wong also teaches this limitation in the Abstract: "The image generation process supports automatic pixel annotation"; this includes the product class label (Page 4, Fig. 1) and its associated database.),
[media_image1.png: greyscale image]
"wherein the unique test synthetic image is not linked to the unique identification of the final 3-dimensional synthetic item" (Wong teaches the specific use of a validation set (test images): "The rendered images were split into a train and validation set. The validation split of the rendered images is used to measure the deviation in network performance ..." (Page 13)), and "repeatedly" (Wong teaches a system that should be repeatedly tuned (optimized) if the performance is not satisfactory: "The system described here, which performs both data generation and model training, can be thought of as a black box with tunable hyperparameters that include both rendering and network training hyperparameters" (Page 16); "... an appropriate optimization procedure (e.g., a greedy sequential search, or Bayesian optimization) ... may be able to efficiently find a suitable lighting distribution in our rendering framework which maximizes performance of the model by optimizing network training, as well as optimizing the training data itself." (Page 16). Here, "optimizing the training data" by finding a suitable lighting distribution requires repeating step d (creating new synthetic images with different lighting/orientation settings), and "maximizing performance" requires checking whether the current performance meets a standard (step h: determining if the confidence is sufficient) and, if not, running the optimization procedure again.).
Therefore, it would have been obvious to one of ordinary skill in the art to combine Kumar and Wong before the effective filing date of the claimed invention. Kumar discloses an instrument tracking machine that utilizes an object identifier trained using a "large synthetic dataset" to avoid problems with small real-world datasets. Kumar teaches simulating 3D scenes, placing 3D models of instruments, and randomly varying parameters such as "lighting," "orientation," and "camera location" (azimuth/elevation) to generate training data (C8:L42-44, C8:L55-60, C9:L19-23). Kumar further teaches that the system provides a "likelihood score indicating a level of confidence" in its identification (C14:L23-24). However, Kumar primarily discusses using "3D models (e.g., computer-aided design (CAD) models)" (C9:L15) and does not explicitly detail the step of scanning a physical instrument multiple times to create that model, nor the specific mesh data structures (vertices, UVs) defined in the claim. Furthermore, while Kumar teaches generating data, it describes a generally linear process.
Wong teaches a method for generating synthetic datasets where the 3D models are created via photogrammetry (scanning a physical object multiple times) rather than just CAD. Wong further details the specific mesh revision steps (defining vertices, UV coordinates) (Page 9). Wong teaches that this process is not a one-time event but a "black box" optimization loop. Wong teaches that if a model's performance is "not … robust" or "deviates" from reality, "further optimization" of the training data itself is required (Page 14-16).
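For context, a minimal sketch of the .obj-style mesh data structure described by Wong (Page 9) and recited in step b follows (vertex positions, UV texture coordinates, vertex normals, faces as lists of vertex indices, and an accompanying texture image). The class, field, and function names are hypothetical and are not drawn from either reference.

    from dataclasses import dataclass, field
    from typing import List, Tuple

    @dataclass
    class SyntheticItemMesh:
        # Elements that may be defined when revising the preliminary 3-D model (step b):
        vertices: List[Tuple[float, float, float]] = field(default_factory=list)  # position of each vertex
        uvs: List[Tuple[float, float]] = field(default_factory=list)              # UV texture coordinate per vertex
        normals: List[Tuple[float, float, float]] = field(default_factory=list)   # vertex normals
        faces: List[List[int]] = field(default_factory=list)                      # each polygon as a list of vertex indices
        texture_file: str = "texture.jpeg"                                        # texture image accompanying the mesh

    def write_obj(mesh: SyntheticItemMesh, path: str) -> None:
        # Serialize in the .obj conventions Wong describes (v / vt / vn / f records;
        # face indices in .obj files are 1-based).
        with open(path, "w") as f:
            for x, y, z in mesh.vertices:
                f.write(f"v {x} {y} {z}\n")
            for u, v in mesh.uvs:
                f.write(f"vt {u} {v}\n")
            for x, y, z in mesh.normals:
                f.write(f"vn {x} {y} {z}\n")
            for poly in mesh.faces:
                f.write("f " + " ".join(str(i + 1) for i in poly) + "\n")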
It would have been obvious to one of ordinary skill in the art to modify the synthetic data generation method of Kumar by utilizing the scanning/photogrammetry techniques of Wong to create the necessary 3D models. The motivation would be to allow the system to be trained on instruments for which no CAD files exist or to capture realistic "surface texture information" (as desired by Kumar C9:L18) from actual physical specimens that may show wear or specific manufacturing traits.
It would also have been obvious to one of ordinary skill in the art to modify Kumar by incorporating Wong's scanning and iterative optimization pipeline. The motivation is grounded in the need to ensure the "likelihood score" (confidence) taught by Kumar is sufficiently high for surgical safety. As Wong teaches, a system cannot "maximize performance" or decide that a model is "not robust" without implicitly comparing the results to a standard. Therefore, a skilled artisan would be motivated to use Wong's "further optimization" loop (repeating the synthetic data generation) whenever the "likelihood score" from Kumar falls below a pre-set confidence factor (e.g., a required safety threshold). This combination allows the system to automate the decision of whether to deploy the model or "repeat" the training to achieve the high accuracy required for surgical tracking.
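The threshold-gated repetition discussed above can be summarized by the following minimal sketch of the control flow (generate synthetic data, train, evaluate a confidence factor, repeat while it falls below a pre-set value). The generation, training, and identification helpers are simulated placeholders rather than APIs from Kumar or Wong, and the pre-set confidence value is only an example (cf. claim 8).

    import random

    PRESET_CONFIDENCE = 0.95  # example pre-set confidence factor (cf. claim 8)

    # Simulated stand-ins for steps d-g; a real system would render images and train a CNN.
    def generate_training_images(item_id, n=100):
        return [f"{item_id}-train-{i}" for i in range(n)]        # step d (linked to the ID)

    def generate_test_images(item_id):
        return [f"test-{random.random():.6f}"]                   # step e (not linked to the ID)

    def train_identifier(training_images):
        return {"images_seen": len(training_images)}             # step f: identification model

    def identify(model, test_images):
        return "instrument-0001", random.uniform(0.80, 1.00)     # step g: ID + confidence factor

    def train_until_confident(item_id, max_rounds=10):
        # Step h: repeat steps d through g while the identification is wrong
        # or the numeric confidence factor is below the pre-set factor.
        for _ in range(max_rounds):
            model = train_identifier(generate_training_images(item_id))
            predicted_id, confidence = identify(model, generate_test_images(item_id))
            if predicted_id == item_id and confidence >= PRESET_CONFIDENCE:
                return model
        return None

    model = train_until_confident("instrument-0001")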
Claim 2.
The combination of Kumar and Wong discloses the method of claim 1, wherein at least one of steps b through h is accomplished automatically without user input (Kumar: "automatically generate a large and diverse synthetic dataset" (C8:L59)).
Claim 3.
The combination of Kumar and Wong discloses the method of claim 2, wherein steps d through h are accomplished automatically using a computer vision-driven artificial intelligence network (Kumar teaches "instrument classifier may be or include a deep convolutional neural network" (C7:L54-55)).
Claim 4.
The combination of Kumar and Wong discloses the method of claim 2, wherein the computer vision-driven artificial intelligence network is a convolutional neural network (Kumar teaches "instrument classifier may be or include a deep convolutional neural network" (C7:L54-55)).
Claim 5.
The combination of Kumar and Wong discloses the method of claim 1 further comprising:
i. when the system correctly identifies the 3-dimensional synthetic item and the numeric confidence factor is equal to or greater than a pre-set confidence factor, uploading the identification model attributable to the unique identification to a server for deployment (Kumar teaches "accessed ... from the database 115 via the network 190" (C6:L63-64)).
Claim 6.
The combination of Kumar and Wong discloses the method of claim 1, wherein step d continuously generates new unique training synthetic images of the final 3-dimensional synthetic item by varying at least one element (Kumar teaches "continuously running the instrument recognizer" (C16:L53-54), and Wong teaches that the rendering stage produces an "infinite supply" of data (Page 5)) selected from the group consisting of: orientation of the final 3-D synthetic model, synthetic light color or intensity illuminating the final 3-D synthetic model (Kumar: "The system randomly changes (e.g., via domain randomization) the location, orientation or pose of the 3D objects ..." (C8:L55-57)), and elevation of the final 3-D synthetic model above an identified surface (Kumar: "... ranges of possible camera locations ... (e.g., azimuth, elevation, pan, tilt, etc.) ..." (C9:L22-23)), wherein each new unique training synthetic image is linked to the unique identification of the final 3-dimensional synthetic item (Kumar teaches that, as the machine generates the synthetic image, it simultaneously generates the "link" to the correct identification (the label/mask), C8:L59-62).
Claim 7.
The combination of Kumar and Wong discloses the method of claim 1, wherein step e continuously generates new unique test synthetic images of the final 3-dimensional synthetic item (Kumar teaches "continuously running the instrument recognizer" (C16:L53-54), and Wong teaches that the rendering stage produces an "infinite supply" of data (Page 5)) by varying at least one element selected from the group consisting of: orientation of the final 3-D synthetic model (Kumar: "The system randomly changes (e.g., via domain randomization) the location, orientation or pose of the 3D objects ..." (C8:L55-57)), synthetic light color or intensity illuminating the final 3-D synthetic model (Kumar: "... changes ... the lighting ..." (C8:L57) and "... ranges of parameters for defining lighting ... such as the brightness range or the ranges of spectral composition variations." (C9:L19-21)), and elevation of the final 3-D synthetic model above an identified surface (Kumar: "... ranges of possible camera locations ... (e.g., azimuth, elevation, pan, tilt, etc.) ..." (C9:L22-23)), wherein the new unique test synthetic images are not linked to the unique identification of the final 3-dimensional synthetic item (Wong teaches the specific use of a validation set (test images): "The rendered images were split into a train and validation set. The validation split of the rendered images is used to measure the deviation in network performance ..." (Page 13)).
Claim 8.
The combination of Kumar and Wong discloses the method of claim 1, wherein the numeric confidence factor is greater than 95 percent (Wong achieves "accuracy of 96%" (Page 1)).
Conclusion
The prior art made of record and not relied upon, but considered pertinent to applicant's disclosure, is listed on the PTO-892 form.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Ross Varndell whose telephone number is (571)270-1922. The examiner can normally be reached M-F, 9-5 EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, O’Neal Mistry can be reached at (313)446-4912. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/Ross Varndell/Primary Examiner, Art Unit 2674