Last updated: May 29, 2026

Application No. 18/448,029

SYSTEMS AND METHODS FOR FILTERING OF COMPUTER VISION GENERATED TAGS USING NATURAL LANGUAGE PROCESSING

Non-Final OA §101 §103

Filed

Aug 10, 2023

Priority

Dec 31, 2015 — continuation of 14/986,219 +1 more

Examiner

MOSER, BRUCE M

Art Unit

2154

Tech Center

2100 — Computer Architecture & Software

Assignee

Entefy Inc.

OA Round

3 (Non-Final)

Interview Optional

This examiner has an 84% grant rate with a +20.1% interview lift. An interview has already been conducted in this application's prosecution history, so a written response with narrowed claims, based on precedent claim evolution patterns, is recommended.

Based on 746 resolved cases, 2023–2026

Examiner Intelligence

MOSER, BRUCE M

Grants 84% — above average

Career Allowance Rate

629 granted / 746 resolved

+29.3% vs TC avg

Strong +20% interview lift

Interview Lift

+20.1% (resolved cases with an interview vs. without)

Typical timeline

2y 8m

Avg Prosecution

30 currently pending

Career history

794

Total Applications

across all art units

Statute-Specific Performance

§101

11.5%

-28.5% vs TC avg

§103

38.4%

-1.6% vs TC avg

§102

35.9%

-4.1% vs TC avg

§112

7.2%

-32.8% vs TC avg

Tech Center average is an estimate. Based on career data from 746 resolved cases.

Office Action

§101 §103

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Detailed Action
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 10/10/25 has been entered.
 	In amendments dated 10/10/25, Applicant amended claims 2, 9, and 16, canceled no claims, and added no new claims.  Claims 2-21 are presented for examination.
	Applicant is advised that the instant application is now being examined by Examiner Bruce Moser.

Rejections under 35 U.S.C. 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 2-21 are rejected under 35 U.S.C. 101 because the claimed invention is directed to mental processes without significantly more. Independent claims 2, 9, and 16 each recite determining, using one or more hardware processors coupled to a non-transitory memory of a system for an artificial intelligence (AI) model pipeline, content in a media file using an image analyzer AI model of a plurality of computer vision AI models of the AI model pipeline; selecting, using the one or more hardware processors, a subset of the plurality of computer vision AI models usable to analyze the media file based on the content, wherein the subset comprises a first AI model trained for a first image classification analysis and a second AI model trained for a second image classification analysis; executing, using the one or more hardware processors, a first run of the first AI model for the first image classification analysis based on the content; generating, using the one or more hardware processors, a first set of computer vision tags for the media file based on the first run; executing, using the one or more hardware processors, a second run of the second AI model for the second image classification analysis based on the content; generating, using the one or more hardware processors, a second set of computing vision tags for the media file based on the first run; segmenting a plurality of objects from the media file using the first set and the second set of computing vision tags; concatenating the plurality of objects in the media file that prevents, for each of the plurality of objects, different ones of the plurality of objects from being combined when segmented from the media file for generating a plurality of first computer vision tags for the plurality of objects segmented from the media file; performing an intra-model fusion of outputs from at least the first and second AI models of the AI model pipeline based on the concatenated plurality of objects and the first set and the second set of computing vision tags; generating, using the one or more hardware processors and based on the intra-model fusion, the plurality of first computer vision tags for the media file, wherein each of the plurality of first computer vision tags is associated with a confidence value that each of the plurality of objects is properly labeled; filtering, using the one or more hardware processors, the plurality of first computer vision tags based on the confidence values and a Natural Language Processing (NLP) model, wherein the filtering removes a portion of the plurality of first computer vision tags based on first corresponding ones of the confidence values at or below a predetermined threshold and prioritizes a remaining portion of the plurality of first computer vision tags based on a ranking of second corresponding ones of the confidence values; and tagging, using the one or more hardware processors, the content in the media file based on the filtered plurality of first computer vision tags.
Determining content is evaluating and a mental process, selecting AI models is evaluating and a mental process, generating first and second sets of tags are recited broadly and are mental processes accomplishable in the human mind or on paper, executing first and second runs of an AI model are not significantly more than mental processes per Recentive Analytics v. Fox Broadcasting Corp. (134 F.4th 1205, 2025 U.S.P.Q.2d 628), segmenting and concatenating objects and performing a fusion of outputs are each recited broadly and are mental processes accomplishable in the human mind or on paper, generating and filtering the plurality of tags are each recited broadly and are mental processes accomplishable in the human mind or on paper, and tagging content is a mental process accomplishable in the human mind or on paper.
Each claim recites an additional element of outputting, from the AI model pipeline using the one or more hardware processors, the media file comprising the tagged content having the filtered plurality of first computer vision tags searchable by a search process in place of performing an AI visual analysis of the content, which is an output step and insignificant extra-solution activity. Claim 9 recites a non-transitory memory and one or more hardware processors, and claim 16 recites a non-transitory computer readable medium comprising computer readable instructions, which are each generic components of a computer.  Examiner notes specification paragraphs 0004-0006 recite drawbacks in the technology for object identification in images, videos, and other content (some images and/or videos lack meaningful tags or descriptions causing users to be unable to discover said content via search or any means other than direct user lookup, deep learning has been successful in identifying some information in images, a human-comparable automatic annotation of images and videos (comparable to deep learning identifying information in images) such as producing natural-language descriptions solely from visual data is still far from being achieved, and recognition parameters are not personalized at a user level and may not account for user preferences in searches).  Specification paragraphs 0018-0025 describe techniques in the invention for addressing the above drawbacks but the claim steps do not recite a particular improvement in any technology or function of a computer per MPEP 2106.04(d) and do not recite any unconventional steps in the invention per MPEP 2106.05(a).  Therefore, the recited mental processes are not integrated into a practical application.
Taking the claims as a whole, the output step is recited broadly and amounts to sending data across a network per specification paragraph 0026 figure 1A computer networks 101, which is routine and conventional activity per the list of such activities in MPEP 2106.05(d) part II.  The non-transitory memory, one or more hardware processors, and non-transitory computer readable medium comprising computer readable instructions are still each generic components of a computer.  Thus the claims do not include additional elements that are sufficient to amount to significantly more than the recited mental processes.
Claims 3, 10, and 17 each recites wherein the subset of the plurality of computer vision AI models is further selected based on user preferences for a user performing a search associated with the media file, and selecting AI models is evaluating and a mental process.  Claims 4, 11, and 18 each recites determining, using the one or more hardware processors, the user preferences based on at least one of past searches for past content in past media files by the user or ones of the plurality of computer vision AI models usable for identifying the past content for the past searches, and determining user preferences is evaluating and a mental process.  Claims 5, 12, and 19 each recites identifying, using the one or more hardware processors, one of the plurality of first computer vision tags having a corresponding one of the confidence values at or below the predetermined threshold, and identifying a confidence value per a threshold is evaluating and a mental process; reprocessing, using the one or more hardware processors, the one of the plurality of first computer vision tags using the subset of the plurality of computer vision AI models and the NLP model, and reprocessing by applying an AI model is not significantly more than mental processes per Recentive Analytics v. Fox Broadcasting Corp. (134 F.4th 1205, 2025 U.S.P.Q.2d 628); determining, using the one or more hardware processors, that the one of the plurality of first computer vision tags is an irrelevant tag based on the reprocessing, and determining a tag is irrelevant is evaluating and a mental process; and discarding the one of the plurality of first computer vision tags based on being the irrelevant tag, and discarding a tag as irrelevant is a mental process accomplishable in the human mind or on paper.
Claims 6, 13, and 20 each recites wherein the determining the content includes determining a plurality of second computer vision tags initially used to tag the content in the media file, and wherein the selecting is further based on the plurality of second computer vision tags, and determining tags is evaluating and a mental process.  Claims 7, 14, and 21 each recites extracting, using the one or more hardware processors, a plurality of frames from the media file based on the content and the second plurality of computer vision tags, and extracting frames is recited broadly and amounts to receiving data across a network and is routine and conventional per the list of such activities in MPEP 2106.05(d) part II; and building, using the one or more hardware processors, at least one scene using the extracted plurality of frames, which is recited broadly and a mental process accomplishable in the human mind or on paper, wherein the executing the first run and the second run are further based on the built at least one scene, and executing an AI model is not significantly more than mental processes per Recentive Analytics v. Fox Broadcasting Corp. (134 F.4th 1205, 2025 U.S.P.Q.2d 628).  Claims 8 and 15 each recites wherein the plurality of computer vision AI models comprises at least one of an object segmentation model, an object localization model, an object detection and recognition model, the NLP model, or a relevance feedback loop model, and using an AI model is applying it which is not significantly more than mental processes per Recentive Analytics v. Fox Broadcasting Corp. (134 F.4th 1205, 2025 U.S.P.Q.2d 628).
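For readers unfamiliar with the filtering limitation recited in independent claims 2, 9, and 16, the following is a minimal illustrative sketch, not the applicant's implementation and not part of the Office action. It shows only the two confidence-based operations the claims name: removing tags whose confidence is at or below a predetermined threshold, and prioritizing the remainder by ranking on confidence. The `Tag` type, the `filter_and_rank` function, the sample tags, and the 0.50 threshold are all hypothetical, and the NLP model's contribution to the filtering is omitted because the action does not describe it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tag:
    """Hypothetical computer vision tag with its confidence value."""
    label: str
    confidence: float  # assumed to be a score in [0, 1]


def filter_and_rank(tags, threshold):
    """Drop tags at or below `threshold`, then rank the rest by confidence.

    Mirrors only the confidence-based part of the claimed filtering;
    the NLP-based part is intentionally left out of this sketch.
    """
    # "At or below" the threshold is removed, so only strictly greater survives.
    kept = [t for t in tags if t.confidence > threshold]
    # Prioritize remaining tags from most to least confident.
    return sorted(kept, key=lambda t: t.confidence, reverse=True)


# Hypothetical sample data for illustration only.
tags = [Tag("dog", 0.92), Tag("cat", 0.40), Tag("leash", 0.75), Tag("sofa", 0.50)]
ranked = filter_and_rank(tags, threshold=0.50)
print([t.label for t in ranked])  # ['dog', 'leash']
```

Note that "sofa" is removed even though its confidence equals the threshold, which reflects the "at or below" wording of the claim.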

Relevant Prior Art
	During his search for prior art, Examiner found the following references to be relevant to Applicant's claimed invention.  Each reference is listed on the Notice of References form included in this office action:
	Mishra (US 9,465,994) teaches predicting performance of vision algorithms for imaging data, determining characteristics or attributes of imaging data such as tags, determining content of the images, predicting the most appropriate of the algorithms, running the algorithm and generating the tags, does not teach segmentation using the tags or object concatenation in the images, or intra-model fusion using output from the algorithms or filtering of the tags (columns 2-3 lines 64-46, column 4 lines 3-25, columns 8-9 lines 42-8); and 
	Garrigues et al (US 9,218,364) teaches providing a user with tags for video or image content and techniques for combining and segmenting images prompting the tagging of said images, does not teach segmentation using the tags or object concatenation in the images, or intra-model fusion using output from the algorithms or filtering of the tags (column 3 lines 8-54, column 10 lines 1036 figure 4A).

Responses to Applicant’s Remarks
	Regarding rejections of claims 2-21 under 35 U.S.C. 101 for reciting mental processes without significantly more, Applicant’s arguments have been considered and are not persuasive.  On pages 11-12 of his Remarks Applicant asserts, under Step 2 Prong Two of the Eligibility Analysis, “the claims provide specific improvements in technology so as to be limited to a practical application that improves over the prior systems.”  Examiner disagrees and notes the claims do not recite specific details reciting how the invention implements an AI pipeline of individual AI models and fuses computer vision tags generated by such models into a set of tags that better identifies objects in images.  The claims recite conclusory statements of determining content in a media file, selecting a subset of models, executing the models, generating tags, segmenting objects using the tags, concatenating objects in the media file, performing an intra-model fusion of output from the executed models, generating the tags, filtering the tags, and tagging the determined content.  Thus Examiner does not believe the claims recite a particular improvement in any technology or function of a computer per MPEP 2106.04(d) and are not integrated into a practical application.
Regarding rejections of claims 2-21 under 35 U.S.C. 103 by Pesavento, Givental, Weisel, and Dunn, Applicant’s arguments on pages 12-14 of his Remarks have been considered and are persuasive that Applicant’s amendments overcome these references’ teachings.

Inquiry

Any inquiry concerning this communication or earlier communications from the examiner should be directed to BRUCE M MOSER whose telephone number is (571)270-1718. The examiner can normally be reached M-F 9a-5p.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Boris Gorney can be reached at 571 270-5626. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/BRUCE M MOSER/ Primary Examiner, Art Unit 2154, 5/15/26

Prosecution Timeline

Jul 11, 2025

Final Rejection mailed — §101, §103

Sep 20, 2025

Interview Requested

Oct 03, 2025

Interview Requested

Oct 08, 2025

Applicant Interview (Telephonic)

Oct 08, 2025

Examiner Interview Summary

Oct 10, 2025

Request for Continued Examination

Oct 15, 2025

Response after Non-Final Action

May 19, 2026

Non-Final Rejection mailed — §101, §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

17/669,293

Patent 12602403

SCALABLE PARALLEL CONSTRUCTION OF BOUNDING VOLUME HIERARCHIES

4y 2m to grant Granted Apr 14, 2026

18/464,356

Patent 12585717

System and Method for Recommending Users Based on Shared Digital Experiences

2y 6m to grant Granted Mar 24, 2026

19/048,422

Patent 12579198

TEXT STRING COMPARISON FOR DUPLICATE OR NEAR-DUPLICATE TEXT DOCUMENTS IDENTIFIED USING AUTOMATED NEAR-DUPLICATE DETECTION FOR TEXT DOCUMENTS

1y 1m to grant Granted Mar 17, 2026

18/233,339

Patent 12554783

USING DISCOVERED UNIFORM RESOURCE IDENTIFIER INFORMATION TO PERFORM EXPLOITATION TESTING

2y 6m to grant Granted Feb 17, 2026

18/178,859

Patent 12530419

DATA MANAGEMENT APPARATUS, DATA MANAGEMENT METHOD, AND NON-TRANSITORY RECORDING MEDIUM

2y 10m to grant Granted Jan 20, 2026

Study what changed to get past this examiner. Based on 5 most recent grants.

Prosecution Projections

3-4

Expected OA Rounds

84%

Grant Probability

99%

With Interview (+20.1%)

2y 8m (~0m remaining)

Median Time to Grant

High

PTA Risk

Based on 746 resolved cases by this examiner. Grant probability derived from career allowance rate.
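The arithmetic behind these projections can be checked with a short sketch. It assumes the grant probability is simply granted cases divided by resolved cases, and that the interview figure adds the lift in percentage points and is capped at 99%. The cap is a guess made to match the displayed value, not something the page states, and the variable names are hypothetical.

```python
# Figures taken from the page: 629 granted out of 746 resolved cases.
granted, resolved = 629, 746
lift_points = 20.1  # interview lift, assumed to be percentage points

# Career allowance rate as a percentage.
rate = granted / resolved * 100
print(round(rate, 1))  # 84.3

# Assumption: the lift is added and the result is capped at 99%,
# since 84.3 + 20.1 would otherwise exceed 100%.
with_interview = min(rate + lift_points, 99.0)
print(with_interview)  # 99.0
```

Under these assumptions the sketch reproduces the displayed 84% grant probability and the 99% figure with an interview.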