DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Applicant’s election without traverse of Species I.a in the reply filed on 12/19/2025 is acknowledged. Claims 1-27 are pending; claims 5-8, 11-19, and 22-26 are withdrawn from further consideration.
Claim Objections
Claim 3 is objected to because of the following informalities:
Claim 3: The formula in claim 3 is too blurry to be legible. A clear, legible version should be provided.
Appropriate correction is required.
The lengthy specification has not been checked to the extent necessary to determine the presence of all possible minor errors. Applicant’s cooperation is requested in correcting any errors of which applicant may become aware in the specification.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-4, 9-10, 20-21, and 27 are rejected under 35 U.S.C. 103 as being unpatentable over Santin (“FAST VISUAL GROUNDING IN INTERACTION – Bringing few-shot learning with neural networks to an interactive robot” – October 2019 – pages 1-33, provided in Applicant’s Information Disclosure Statement – IDS) in view of Snell et al. (“Prototypical Networks for Few-shot Learning” – arXiv 2017 – pages 1-13).
Claim 1: Santin discloses a computer-implemented (see Santin, paragraph “Our situated agent setup …” in page 6, paragraph “Our implementation …” in page 8) method, comprising: sectioning at least a portion of a real data set of interest into a grid of chips, each chip comprising a real data subset of the portion of the real data set of interest (see Santin, paragraph “One of the contributions …” at page 11, paragraph “Firstly …” at page 16, paragraphs “Since the images …” and “We use …” at page 20, “saves the image of the mentioned object in the dataset”, “collection of 400 images with a size of 224x224 each”, “k images per each label (1-shot, 5-shot or 10-shot) ... The rest of the images of the same labels are taken as target (t) to evaluate the classification process” [the Examiner notes that the Applicant’s specification defines sub-images to be referred to herein as chips]), and receiving a few user-selected chips corresponding to ground truth examples selected from the portion of the real data set, wherein the selected chips define a support set for a few-shot class prototype (see Santin, paragraph “In each …” at page 11, paragraphs “Firstly …” and “If the number …” at page 16, “the human tutor can present the object”, “requests the human tutor to show it more instances about that category”, “takes a support set S with k labelled images (each one with a size of 224x224 pixels) of each of the n categories of objects”); encoding a latent space representation of the support set using an embedding neural network, and defining the few-shot class prototype of the latent space representation of the support set (see Santin, Fig. 8, the caption of Fig. 5, paragraph “In each …” at page 11, “labelled images of the support set (S) are encoded by the VGG16 convolutional layers and the embeddings processed by the g function”, “All the images are encoded through the VGG16”); and using the embedding neural network, encoding a latent space representation of other chips of the real data set of interest, and, using a few-shot neural network, comparing the latent space representation of the other chips to the few-shot class prototype and assigning few-shot class prototype labels to the other chips based on the comparison to identify features in the real data set of interest that are similar to the few user-selected chips (see Santin, Fig. 8, paragraph “As the main …” in page 9, the caption of Fig. 5, paragraphs “In each …” and “Once the …” at page 11, “takes … a target image t, which is not labelled …”, “target image (t) is also encoded and embedded by its own function f … the matching network computes the cosine similarity between the t and …”, “These results are … presented … belonging to the categories of the images in S …”).
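For illustration only, the claimed sectioning of a data set into a grid of chips can be sketched in a few lines of numpy. This is an illustrative sketch, not code from Santin or the Applicant; the function name, chip size, and non-overlapping grid layout are assumptions:

```python
import numpy as np

def section_into_chips(data, chip_size):
    """Section a 2-D data array into a grid of non-overlapping chips,
    each chip being a chip_size x chip_size subset of the input."""
    rows = data.shape[0] // chip_size
    cols = data.shape[1] // chip_size
    return [
        data[r * chip_size:(r + 1) * chip_size,
             c * chip_size:(c + 1) * chip_size]
        for r in range(rows) for c in range(cols)
    ]

# a 448x448 "real data set of interest" yields a 2x2 grid of 224x224 chips,
# matching the 224x224 image size quoted from Santin above
data = np.zeros((448, 448))
chips = section_into_chips(data, 224)
```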
Santin, however, fails to explicitly disclose defining the few-shot class prototype as a mean vector of the latent space representation of the support set. Snell discloses this feature (see Snell, Sections 2.1 and 2.2, computing the prototype from the support examples using the mean vector formula in Equation 1).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Santin’s method with Snell’s teachings by incorporating the few-shot class mean vector processing into Santin’s few-shot class process in order to greatly improve the distance metric results of the matching networks (see Snell, Sections 2.1 and 2.2, paragraph “Distance metric …” in page 4).
Re Claim 2: Snell further discloses wherein the encoding the latent space representation of the support set comprises transforming the support set data, having D-dimensionality, into the latent space representation, having an M-dimensionality, through an embedding function f_φ having learnable parameters φ (see Snell, Sections 2.1 and 2.2, computing the prototype from the support examples using the mean vector formula in Equation 1). See claim 1 for the obviousness and motivation statements.
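For illustration only, the claimed embedding function f_φ can be sketched as a toy linear map from the D-dimensional data space to the M-dimensional latent space. This is an illustrative numpy sketch; the single linear layer, the dimensions, and all names are assumptions standing in for a real embedding network such as Santin's VGG16:

```python
import numpy as np

D, M = 8, 3                        # D-dimensional input, M-dimensional latent space
rng = np.random.default_rng(0)
phi = rng.standard_normal((D, M))  # learnable parameters φ (here one linear map)

def f_phi(x):
    """Toy embedding function f_φ mapping D-dimensional support set data
    into the M-dimensional latent space representation."""
    return np.asarray(x) @ phi

z = f_phi(np.ones(D))  # one D-dimensional chip -> one M-dimensional embedding
```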
Re Claim 3: Snell further discloses wherein the support set S for the class prototype k is S = {(x_1, y_1), …, (x_N, y_N)}, where x_i represents a chip i and y_i is the corresponding true class label, the transforming with the embedding function produces transformed chips through f_φ(x_i) = z_i, and the mean vector comprising embedded support points for the class prototype k is defined by:

c_k = (1 / N_{S_k}) Σ_{(z_i, y_i) ∈ S_k} z_i

(see Snell, Sections 2.1 and 2.2, computing the prototype from the support examples using the mean vector formula in Equation 1). See claim 1 for the obviousness and motivation statements.
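For illustration only, the mean-vector prototype mapped above amounts to averaging the N_{S_k} embedded support points z_i of one class. This is an illustrative numpy sketch, not code from Snell; the function and variable names are assumptions:

```python
import numpy as np

def class_prototype(embedded_support):
    """Mean vector c_k over the embedded support points z_i of one class,
    i.e. c_k = (1 / N_{S_k}) * sum of the z_i in S_k."""
    z = np.asarray(embedded_support, dtype=float)  # shape (N_{S_k}, M)
    return z.sum(axis=0) / z.shape[0]

# three embedded support chips (M = 2) for one class prototype k
c_k = class_prototype([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])  # -> [3., 4.]
```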
Re Claim 4: Santin further discloses wherein the comparing the latent space representation of the other chips to the few-shot class prototype and assigning few-shot class prototype labels to the other chips based on the comparison (see Santin, Fig. 8, paragraph “As the main …” in page 9, the caption of Fig. 5, paragraph “In each …” at page 11, “takes … a target image t, which is not labelled …”, “target image (t) is also encoded and embedded by its own function f … the matching network computes the cosine similarity between the t and …”) comprises, for each other chip: calculating a distance between the latent space representation of the chip and the few-shot class prototype (see Santin, Fig. 8, paragraph “As the main …” in page 9, the caption of Fig. 5, paragraphs “In each …” and “Once the …” at page 11, “computes the cosine similarity between the t and …” over the n categories so we get one score per each of the categories); normalizing the distance into class probabilities using a softmax (see Santin, Fig. 8, paragraph “As the main …” in page 9, the caption of Fig. 5, paragraphs “In each …” and “Once the …” at page 11, “these results are computed through a Softmax function so they are normalised”); and assigning the few-shot class prototype label to the chip where the few-shot class prototype label has a highest class probability (see Santin, paragraph “Once the …” in page 11, caption of Figure 5, paragraph “Based on the proposal …” in page 17, “the output scores reflect that t …” is most similar to the label with the highest probability score of 0.81).
Santin, however, fails to explicitly disclose calculating a Euclidean distance between the latent space representation of the chip and the few-shot class prototype. Snell discloses this feature (see Snell, Sections 2.1 and 2.2, paragraph “Distance metric …” in page 4, noting that using the squared Euclidean distance can greatly improve the results).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Santin’s method with Snell’s teachings by applying Snell’s squared Euclidean distance metric in place of Santin’s cosine similarity metric in order to greatly improve the distance metric results of the matching networks (see Snell, Sections 2.1 and 2.2, paragraph “Distance metric …” in page 4).
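For illustration only, the two distance metrics at issue in this rejection (Santin's cosine similarity and Snell's squared Euclidean distance) feed the same softmax-then-argmax labeling step. This is an illustrative numpy sketch, not code from either reference; the function and variable names are assumptions:

```python
import numpy as np

def class_probs(z, prototypes, metric="cosine"):
    """Score an embedded chip z against each class prototype and normalise
    the scores into class probabilities with a softmax, using either cosine
    similarity (as in Santin) or negated squared Euclidean distance
    (as in Snell, so that nearer prototypes score higher)."""
    p = np.asarray(prototypes, dtype=float)
    z = np.asarray(z, dtype=float)
    if metric == "cosine":
        scores = p @ z / (np.linalg.norm(p, axis=1) * np.linalg.norm(z))
    else:
        scores = -((p - z) ** 2).sum(axis=1)
    e = np.exp(scores - scores.max())  # numerically stable softmax
    return e / e.sum()

z = [1.0, 0.0]                         # embedded target chip
prototypes = [[0.9, 0.1], [0.0, 1.0]]  # one prototype per class
probs = class_probs(z, prototypes, metric="euclidean")
label = int(np.argmax(probs))          # assign the highest-probability label
```

Under either metric the chip is assigned the label whose prototype it matches best; only the scoring function changes.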
Re Claim 9: Santin further discloses wherein the embedding neural network is an off-the-shelf neural network pre-trained on a data set related or unrelated to the real data set of interest (see Santin, paragraph “Our implemented neural network …” at page 1, Section 4.2, VGG16 already pre-trained).
Re Claim 10: Santin further discloses wherein the few user-selected chips comprises greater than or equal to one and less than or equal to ten user-selected chips (see Santin, paragraph “In each …” at page 11, paragraphs “Firstly …” and “If the number …” at page 16, “the human tutor can present the object”, “requests the human tutor to show it more instances about that category”, “… five …”).
Re Claim 20: Santin further discloses automatically adaptively sampling desired feature types by adjusting data acquisition parameters and acquiring another real data set at chip locations having an assigned few-shot class prototype label (see Santin, paragraph “In front of this …” in page 6, paragraph “According to … data augmentation … by applying automatic creation of images transformed from the ones in the dataset by applying one or more transformations such as vertical and horizontal translation, rotation, or color and contrast changes” in page 3).
Re Claim 21: Santin further discloses wherein the automatically adjusting data acquisition parameters includes adjusting an imaging system movement stage, an imaging system magnification, an imaging system sampling characteristic, an imaging system detector, or an imaging system detector selection (see Santin, paragraph “In front of this …” in page 6, paragraph “According to … data augmentation … by applying automatic creation of images transformed from the ones in the dataset by applying one or more transformations such as vertical and horizontal translation, rotation, or color and contrast changes” in page 3).
Re Claim 27: Santin discloses a computer-implemented method (see Santin, paragraph “Our situated agent setup …” in page 6, paragraph “Our implementation …” in page 8), comprising: sectioning at least a portion of a real data set of interest into a grid of chips, each chip comprising a real data subset of the portion of the real data set of interest (see Santin, paragraph “One of the contributions …” at page 11, paragraph “Firstly …” at page 16, paragraphs “Since the images …” and “We use …” at page 20, “saves the image of the mentioned object in the dataset”, “collection of 400 images with a size of 224x224 each”, “k images per each label (1-shot, 5-shot or 10-shot) ... The rest of the images of the same labels are taken as target (t) to evaluate the classification process” [the Examiner notes that the Applicant’s specification defines sub-images to be referred to herein as chips]), and receiving a few user-selected chips corresponding to ground truth examples selected from the portion of the real data set, wherein the selected chips define a support set for a few-shot class prototype (see Santin, paragraph “In each …” at page 11, paragraphs “Firstly …” and “If the number …” at page 16, “the human tutor can present the object”, “requests the human tutor to show it more instances about that category”, “takes a support set S with k labelled images (each one with a size of 224x224 pixels) of each of the n categories of objects”); and receiving and/or displaying an identification of features in the real data set of interest that are similar to the few user-selected chips (see Santin, Fig. 8, paragraph “As the main …” in page 9, the caption of Fig. 5, paragraphs “In each …” and “Once the …” at page 11, “takes … a target image t, which is not labelled …”, “target image (t) is also encoded and embedded by its own function f … the matching network computes the cosine similarity between the t and …”, “These results are … presented … belonging to the categories of the images in S …”), wherein the identification is produced by: encoding a latent space representation of the support set using an embedding neural network, and defining the few-shot class prototype of the latent space representation of the support set (see Santin, Fig. 8, the caption of Fig. 5, paragraph “In each …” at page 11, “labelled images of the support set (S) are encoded by the VGG16 convolutional layers and the embeddings processed by the g function”, “All the images are encoded through the VGG16”); and using the embedding neural network, encoding a latent space representation of other chips of the real data set of interest, and, using a few-shot neural network, comparing the latent space representation of the other chips to the few-shot class prototype and assigning few-shot class prototype labels to the other chips based on the comparison (see Santin, Fig. 8, paragraph “As the main …” in page 9, the caption of Fig. 5, paragraph “In each …” at page 11, “takes … a target image t, which is not labelled …”, “target image (t) is also encoded and embedded by its own function f … the matching network computes the cosine similarity between the t and …”).
Santin, however, fails to explicitly disclose defining the few-shot class prototype as a mean vector of the latent space representation of the support set. Snell discloses this feature (see Snell, Sections 2.1 and 2.2, computing the prototype from the support examples using the mean vector formula in Equation 1).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Santin’s method with Snell’s teachings by incorporating the few-shot class mean vector processing into Santin’s few-shot class process in order to greatly improve the distance metric results of the matching networks (see Snell, Sections 2.1 and 2.2, paragraph “Distance metric …” in page 4).
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Yu et al. ’445 discloses a classification system that uses a combined approach in which results from a multi-shot technique (e.g., the trained classification network) and results from a few-shot technique (e.g., a prototype network) are combined to label the data sample.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to BERNARD KRASNIC whose telephone number is (571)270-1357. The examiner can normally be reached Mon. - Thur. and every other Friday from 8am - 4pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Vincent Rudolph can be reached at (571)272-8243. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/Bernard Krasnic/Primary Examiner, Art Unit 2671 February 9, 2026