Prosecution Insights
Last updated: April 19, 2026
Application No. 18/557,023

RECOGNITION DEVICE, RECOGNITION METHOD, RECOGNITION PROGRAM, MODEL LEARNING DEVICE, MODEL LEARNING METHOD, AND MODEL LEARNING PROGRAM

Non-Final OA: §102, §103, §112

Filed: May 16, 2024
Examiner: CAMMARATA, MICHAEL ROBERT
Art Unit: 2667
Tech Center: 2600 — Communications
Assignee: NTT, Inc.
OA Round: 1 (Non-Final)

Grant Probability: 70% (Favorable)
Expected OA Rounds: 1-2
Time to Grant: 2y 4m
Grant Probability With Interview: 99%

Examiner Intelligence

Career Allow Rate: 70% (213 granted / 305 resolved; +7.8% vs TC avg; above average)
Interview Lift: strong, +35.9% in resolved cases with an interview
Typical Timeline: 2y 4m average prosecution; 46 applications currently pending
Career History: 351 total applications across all art units
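For readers checking the arithmetic, here is a short Python sketch of how the headline figures above plausibly derive from the raw counts. The interview-lift formula is an assumption (the dashboard does not state how it is computed), so treat it as illustrative rather than the tool's actual method.

```python
# Plausible derivation of the examiner stats shown above.
# The allow-rate arithmetic follows directly from the displayed counts;
# the interview-lift definition below is an ASSUMPTION, not the tool's
# documented formula.
granted, resolved = 213, 305
career_allow_rate = granted / resolved
print(f"career allow rate: {career_allow_rate:.1%}")  # 69.8%, shown as 70%

def interview_lift(rate_with_interview: float, rate_without_interview: float) -> float:
    """Assumed definition: percentage-point gain in allowance rate for
    resolved cases where an examiner interview was held."""
    return rate_with_interview - rate_without_interview
```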

Statute-Specific Performance

§101: 4.5% (-35.5% vs TC avg)
§103: 45.8% (+5.8% vs TC avg)
§102: 21.1% (-18.9% vs TC avg)
§112: 24.6% (-15.4% vs TC avg)
Tech Center averages are estimates. Based on career data from 305 resolved cases.

Office Action

§102 §103 §112
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Specification

The title of the invention is not descriptive. A new title is required that is clearly indicative of the invention to which the claims are directed. The following title is suggested: Recognizing Food, Container and Residue To Estimate Area Ratio of Consumed And Leftover Food.

Drawings

The drawings are objected to because of poor quality: Figs. 1-3, 5, and 8-12 have poor line quality, blurry text characters, fonts too small to read, and unclear/blurry black-and-white images, all of which are poor-quality reproductions that do not have satisfactory reproduction characteristics, contrary to 37 CFR 1.84(l). Corrected drawing sheets in compliance with 37 CFR 1.121(d) are required in reply to the Office action to avoid abandonment of the application. Any amended replacement drawing sheet should include all of the figures appearing on the immediately prior version of the sheet, even if only one figure is being amended. The figure or figure number of an amended drawing should not be labeled as "amended." If a drawing figure is to be canceled, the appropriate figure must be removed from the replacement sheet, and where necessary, the remaining figures must be renumbered and appropriate changes made to the brief description of the several views of the drawings for consistency. Additional replacement sheets may be necessary to show the renumbering of the remaining figures. Each drawing sheet submitted after the filing date of an application must be labeled in the top margin as either "Replacement Sheet" or "New Sheet" pursuant to 37 CFR 1.121(d). If the changes are not accepted by the examiner, the applicant will be notified and informed of any required corrective action in the next Office action. The objection to the drawings will not be held in abeyance.

Claim Rejections - 35 USC § 112

The following is a quotation of 35 U.S.C. 112(b):

(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:

The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.

Claims 1-8 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or, for applications subject to pre-AIA 35 U.S.C. 112, the applicant) regards as the invention.

Claim 1 recites "acquire related information related to a target in a recognition target image which is a post-image obtained through photographing before and after treatment on a container that stores the target". This claim element has several clarity issues. To start with, the disclosed invention requires inputting an image in order to recognize/segment the "recognition target image", but no such element is present in the claim. The "acquiring" is limited to so-called "related information", while an unclear "to extract" phrase indefinitely refers to extracting the (already acquired) related data and the recognition target image.
Furthermore, the phrase "a recognition target image which is a post-image obtained through photographing before and after treatment on a container that stores the target" is not intelligible. How can a single image be "obtained through photographing before and after" some act such as the "treatment"? This error is repeated later in the claim as "a pre-stored pre-image obtained through the photographing before and after the treatment". Also, what is meant by a "pre-stored pre-image"? Do these terms refer to a single entity or to two different images (one of which is pre-stored and the other a pre-image)? What is a "pre-image", particularly in reference to photographing before and after the "treatment"? Moreover, "treatment on a container" is so far abstracted from the disclosed invention as to lose all meaning whatsoever. The disclosed invention is directed to segmenting an image of a food container such as a dinner plate so that an estimate may be formed regarding the amount of leftover food after a human has eaten. The machine learning model is trained with images taken before and after a human being eats the food contained on/in the plate. The abstracted term "treatment" is so wholly divorced from the disclosed act of a human eating food as to be completely unclear and indefinite.

Claim 1 further recites "estimate a ratio of the target in the recognition target image based on the recognition result and an area ratio in a pre-stored pre-image obtained through the photographing before and after the treatment". A ratio requires two elements, a numerator and a denominator. As such, to what does a "ratio of the target" refer? What is the denominator; a ratio of what? The same issue is repeated for the area ratio: an area ratio of what? Is the "ratio of the target" different from the "area ratio" and, if not, how do these ratios relate to "a pre-stored pre-image obtained through the photographing before and after the treatment"?

Independent claims 5 and 7 are parallel to claim 1 and are rejected for the same reasons as above. Claims 2-4 are rejected due to their dependency upon claim 1.

Independent claims 4, 6, and 8 indefinitely recite "accept a learning post-image obtained through photographing before and after treatment on a container that stores a target, a learning mask image corresponding to the post-image, and learning data including related information related to the target as an input". See the rejections above regarding an image obtained "before and after"; this is even more unclear when reading "post-image", which implies an image taken after, yet the claim nonetheless also specifies photographing the post-image before the "treatment". See also the rejection above regarding "treatment".

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103, which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-3, 5, and 7 are rejected under 35 U.S.C. 103 as being unpatentable over Lu {Lu, Ya, et al., "An artificial intelligence-based system to assess nutrient intake for hospitalised patients," IEEE Transactions on Multimedia 23 (2020): 1136-1147} and Takahashi (US 2015/0206259 A1).

Claim 1

In regards to claim 1, Lu discloses a recognition device comprising: a memory, and at least one processor coupled to the memory {see Section V, Experimental Results, in which the disclosed model was implemented in software running on a computer with a processor to produce the disclosed test results summarized in Tables I-V}, the at least one processor being configured to:

acquire related information related to a target in a recognition target image which is a post-image obtained through photographing before and after treatment on a container that stores the target, and to extract recognition target data which is a combination of the recognition target image and the related information {Section IIIA, Data Capture, in which RGB-D image pairs were captured of food trays before and after hospital patients ate their meals, while noting that "treatment" corresponds to a person eating food stored in or on a container that stores the target (e.g., food item(s)). Note that the "related information" may include the depth image, which is related to the food items for estimating area/volume. See also Section IIIB, Data Annotation, Fig. 4, in which the "related information" may also be the 7 hyper categories (main course, side dish, vegetable, sauce, etc.) and/or the 521 fine-grained categories; and/or annotations for the container (plate)/background and types of plates (e.g., main plate, salad bowl, packaged containers, etc.); weight; and/or the nutrient annotations of each food item. See also Section IIB, Food Database, particularly the annotations (related information) including food categories and food locations for the pixel-level food segmentation map};

accept the recognition target data as an input to a model learned in advance and output a recognition result obtained by recognizing an area where at least the container, the target, and a portion other than the target are divided by an output of the model {see Section IV, Fig. 5, employing a Multi-task Contextual Network (MTCNet), which is a model learned in advance and outputs a "recognition" result recognizing an area where each food type and plate type occur in the image, in order to segment the image into container (plate), target (e.g., one of the food items), and "a portion other than the target" (e.g., a different food item or category such as a sauce).
Note that the BRI of "a portion other than the target" specifically includes a sauce as per [0023] of the instant specification, which distinguishes between food and leftovers such as sauce};

[Two figures from Lu reproduced in the original Office Action (media_image1.png, media_image2.png) are omitted here.]

and wherein the model recognizes the area by converting the recognition target image into a feature amount map and calculating the feature amount map in a weighting manner by latent information obtained from the related information {Section IVA, including the final food prediction segmentation map, which is generated by converting the recognition target image into a feature amount map of the deep CNNs and calculated in a weighted manner by latent information within the layers that is obtained from the related information (e.g., food type, plate type)}.

Takahashi is an analogous reference from the same field of computer vision for estimating food amounts, including the amount remaining after someone has eaten a meal. See the abstract and Figs. 1, 2, 5, and 7, including remaining amount detection module 104. Takahashi also teaches estimating a ratio of the target in the recognition target image based on the recognition result and an area ratio in a pre-stored pre-image obtained through the photographing before and after the treatment {Figs. 6, 7, 8, and 12-14, including step S32 and Fig. 9, calculate area ratio S65; [0027]-[0030], camera 8 directed downward to photograph the table and plates at predetermined intervals such as every 30 seconds; [0042]-[00…], wherein the table images include container area 1316, a dish (food) area, and an area ratio part 1318 that calculates/estimates a ratio of the target (food). See also remaining amount detection module 104, which detects the remaining amount of food left in the container [0041] based on the recognition result as per [0057]-[0058], in which color differences and shapes are used to recognize containers and food (dishes)}.

It would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains to have modified Lu's food/plate segmentation model, which already outputs a recognition result obtained by recognizing an area where at least the container, the target, and a portion other than the target are divided by an output of the model, to include estimating a ratio of the target in the recognition target image based on the recognition result and an area ratio in a pre-stored pre-image obtained through the photographing before and after the treatment, as taught by Takahashi, because doing so enables wait staff to provide better service to the customer as motivated by Takahashi in [????]; because doing so provides a convenient summary measurement of the amount of leftover food using a conventional area ratio; because there is a reasonable expectation of success; and/or because doing so merely combines prior art elements according to known methods to yield predictable results.
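The area-ratio estimation the combination turns on is simple to state in code. Below is a minimal sketch, assuming numpy and integer-labeled segmentation masks; the class label, array shapes, and function name are illustrative assumptions, not Takahashi's or the applicant's implementation.

```python
# Hedged sketch: estimate how much of the target (food) remains by
# comparing its pixel area in the post-treatment mask against its area
# in the pre-stored pre-treatment mask. Labels and shapes are assumptions.
import numpy as np

FOOD = 1  # assumed class label for the target in the segmentation masks

def leftover_ratio(pre_mask: np.ndarray, post_mask: np.ndarray,
                   target_class: int = FOOD) -> float:
    """Fraction of the target's pre-treatment area still present after."""
    pre_area = np.count_nonzero(pre_mask == target_class)
    post_area = np.count_nonzero(post_mask == target_class)
    if pre_area == 0:
        return 0.0  # no target before treatment, nothing to compare
    return post_area / pre_area

# Toy example: 5 food pixels before eating, 2 after -> 40% left over.
pre = np.array([[1, 1, 1, 0],
                [1, 1, 0, 0]])
post = np.array([[1, 0, 0, 0],
                 [1, 0, 0, 0]])
print(leftover_ratio(pre, post))  # 0.4
```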
Claim 2

In regards to claim 2, Lu discloses wherein, in the model, the related information is configured to be converted into the latent information by a fully combined layer, and an information source of the related information is set as one or more pieces of information {see the deep (fully connected) convolutional neural network (deep CNN) in Section IVA, Fig. 5, in which the related information is converted into the latent information by a fully combined layer, and an information source of the related information (see the mapping of claim 1 for the various types of "related information") is set as one or more pieces of information}.

Claim 3

In regards to claim 3, Lu discloses wherein the model performs weighted feature amount map calculation processing on a channel component or a spatial component of the feature amount map {see Section IVA, including the initial feature map on a channel (height x width x channel) component or spatial component in the segmentation branches}.

Claims 5 and 7

The rejection of device claim 1 above applies mutatis mutandis to the corresponding limitations of method claim 5 and computer-readable medium claim 7, while noting that the rejection above cites to both device and method disclosures. For the computer-readable storage medium storing program limitations of claim 7, see Lu Section V, Experimental Results, in which the disclosed model was implemented in software stored on a computer-readable medium and running on a computer with a processor to produce the disclosed test results summarized in Tables I-V.
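To make the mechanism recited in claims 2-3 concrete (a fully connected layer converts related information into latent information, which then weights the feature amount map channel-wise), here is a minimal sketch assuming PyTorch. It is an illustrative channel-attention-style gate, not the architecture of Lu's MTCNet or the applicant's disclosed model; all names and dimensions are assumptions.

```python
# Hedged sketch of "weighted feature amount map calculation" on the
# channel component: related info -> fully connected layer -> latent
# vector -> per-channel weights applied to the feature map.
import torch
import torch.nn as nn

class RelatedInfoGate(nn.Module):
    def __init__(self, related_dim: int, channels: int):
        super().__init__()
        # "fully combined layer": related information -> latent information
        self.fc = nn.Linear(related_dim, channels)

    def forward(self, feature_map: torch.Tensor,
                related_info: torch.Tensor) -> torch.Tensor:
        # feature_map: (batch, channels, H, W); related_info: (batch, related_dim)
        latent = torch.sigmoid(self.fc(related_info))   # (batch, channels)
        weights = latent.unsqueeze(-1).unsqueeze(-1)    # (batch, channels, 1, 1)
        return feature_map * weights                    # channel-wise weighting

gate = RelatedInfoGate(related_dim=8, channels=64)
fmap = torch.randn(2, 64, 32, 32)   # feature amount map from a CNN backbone
info = torch.randn(2, 8)            # encoded related information (e.g. food type)
out = gate(fmap, info)              # same shape, reweighted per channel
```

A spatial-component variant would instead produce an H x W weight map and broadcast it across channels.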
Claim Rejections - 35 USC § 102

The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:

A person shall be entitled to a patent unless – (a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.

Claims 4, 6, and 8 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Lu.

Claim 4

In regards to claim 4, Lu discloses a model learning device comprising: a memory; and at least one processor coupled to the memory {see Section V, Experimental Results, in which the disclosed model was implemented in software running on a computer with a processor to produce the disclosed test results summarized in Tables I-V}, the at least one processor being configured to:

accept a learning post-image obtained through photographing before and after treatment on a container that stores a target, a learning mask image corresponding to the post-image, and learning data including related information related to the target as an input {Section IIIA, Data Capture, in which RGB-D image pairs were captured of food trays before and after hospital patients ate their meals, while noting that "treatment" corresponds to a person eating food stored in or on a container that stores the target (e.g., food item(s)). Note that the "related information" may include the depth image, which is related to the food items for estimating area/volume. See also Section IIIB, Data Annotation, Fig. 4, in which the "related information" may also be the 7 hyper categories (main course, side dish, vegetable, sauce, etc.) and/or the 521 fine-grained categories; and/or annotations for the container (plate)/background and types of plates (e.g., main plate, salad bowl, packaged containers, etc.); weight; and/or the nutrient annotations of each food item. See also Section IIB, Food Database, particularly the annotations (related information) including food categories and food locations for the pixel-level food segmentation map (learning mask)};

convert the image into a feature amount map by a model, and calculate the feature amount map in a weighting manner by latent information obtained from the related information to output a mask image in which an area where at least the container, the target, and a portion other than the target are divided is recognized as a recognition result {see Section IV, Fig. 5, employing a Multi-task Contextual Network (MTCNet), which is a model learned in advance to convert the image into a feature amount map recognizing an area where each food type and plate type occur in the image, in order to output a mask image (segment the image) into container (plate), target (e.g., one of the food items), and "a portion other than the target" (e.g., a different food item or category such as a sauce). Note that the BRI of "a portion other than the target" specifically includes a sauce as per [0023] of the instant specification, which distinguishes between food and leftovers such as sauce. Further, as to outputting a mask image, see the segmentation map and Fig. 4 illustrating the segmented image (mask image)};

and digitize a difference between the mask image of the recognition result and a mask image included in the learning data as a loss and update a parameter of the model to reduce the loss {this passage generally refers to model training that updates model parameter(s) based on a loss; see Section IVB, including equations (1)-(3), and Section VA, segmentation network trained with the Adadelta optimizer and categorical cross-entropy loss}.

Claims 6 and 8

The rejection of device claim 4 above applies mutatis mutandis to the corresponding limitations of method claim 6 and computer-readable medium claim 8. For the computer-readable storage medium storing program limitations of claim 8, see Lu Section V, Experimental Results, in which the disclosed model was implemented in software stored on a computer-readable medium and running on a computer with a processor to produce the disclosed test results summarized in Tables I-V.
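The claim 4 training limitation (digitize the mask difference as a loss, then update parameters to reduce it) maps onto a standard segmentation training step. A hedged sketch, assuming PyTorch, follows: the Adadelta optimizer and categorical cross-entropy mirror what the Office Action cites from Lu Section VA, but the stand-in model, shapes, and loop are illustrative assumptions.

```python
# Hedged sketch of the claimed learning step: predicted mask vs. learning
# mask -> loss -> parameter update. Model and data here are stand-ins.
import torch
import torch.nn as nn

model = nn.Conv2d(3, 4, kernel_size=1)      # toy segmentation head, 4 classes
optimizer = torch.optim.Adadelta(model.parameters())
criterion = nn.CrossEntropyLoss()           # categorical cross entropy per pixel

post_image = torch.randn(2, 3, 32, 32)            # learning post-images
learning_mask = torch.randint(0, 4, (2, 32, 32))  # per-pixel class labels

for step in range(10):
    logits = model(post_image)               # (batch, classes, H, W) mask scores
    loss = criterion(logits, learning_mask)  # difference "digitized" as a loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()                         # update parameters to reduce loss
```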
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Michael R. Cammarata, whose telephone number is (571) 272-0113. The examiner can normally be reached M-Th, 7am-5pm EST. Examiner interviews are available via telephone, in person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Matthew Bella, can be reached at 571-272-7778. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300. Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (in USA or Canada) or 571-272-1000.

/MICHAEL ROBERT CAMMARATA/
Primary Examiner, Art Unit 2667

Prosecution Timeline

May 16, 2024
Application Filed
Feb 27, 2026
Non-Final Rejection — §102, §103, §112 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12602797
RECONSTRUCTION OF BODY MOTION USING A CAMERA SYSTEM
Granted Apr 14, 2026 (2y 5m to grant)
Patent 12586171
METHODS AND SYSTEMS FOR GRADING DEVICES
Granted Mar 24, 2026 (2y 5m to grant)
Patent 12579597
Point Group Data Synthesis Apparatus, Non-Transitory Computer-Readable Medium Having Recorded Thereon Point Group Data Synthesis Program, Point Group Data Synthesis Method, and Point Group Data Synthesis System
Granted Mar 17, 2026 (2y 5m to grant)
Patent 12579835
INFORMATION PROCESSING DEVICE, INFORMATION PROCESSING METHOD, AND COMPUTER-READABLE RECORDING MEDIUM FOR DISTINGUISHING OBJECT AND SHADOW THEREOF IN IMAGE
Granted Mar 17, 2026 (2y 5m to grant)
Patent 12567283
FACIAL RECOGNITION DATABASE USING FACE CLUSTERING
Granted Mar 03, 2026 (2y 5m to grant)
Study what changed to get past this examiner. Based on 5 most recent grants.

AI Strategy Recommendation

Get an AI-powered prosecution strategy using examiner precedents, rejection analysis, and claim mapping.

Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: 70%
With Interview: 99% (+35.9%)
Median Time to Grant: 2y 4m
PTA Risk: Low
Based on 305 resolved cases by this examiner. Grant probability derived from career allow rate.
