Last updated: May 29, 2026

Application No. 18/744,151

MANAGING AND STREAMING A PLURALITY OF LARGE-SCALE DATASETS

Non-Final OA §DOUBLEPATENT

Filed

Jun 14, 2024

Priority

Oct 15, 2020 — provisional 63/091,926 +1 more

Examiner

DUNPHY, DAVID F

Art Unit

2673

Tech Center

2600 — Communications

Assignee

Snark AI Inc.

OA Round

1 (Non-Final)

Interview Optional

Interview lift (+10.3%) is below the 15.0% threshold, so a written response is recommended.

Based on 769 resolved cases, 2023–2026

Examiner Intelligence

DUNPHY, DAVID F

Grants 85% — above average

Career Allowance Rate

654 granted / 769 resolved

+23.0% vs TC avg

Moderate +10% lift

+10.3%

Interview Lift

resolved cases with interview

Fast prosecutor

2y 2m

Avg Prosecution

13 currently pending

Career history

783

Total Applications

across all art units

Statute-Specific Performance

§101

2.8%

-37.2% vs TC avg

§103

72.3%

+32.3% vs TC avg

§102

9.0%

-31.0% vs TC avg

§112

6.1%

-33.9% vs TC avg

Tech Center averages are estimates. Based on career data from 769 resolved cases.

Office Action

§DOUBLEPATENT

DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Allowable Subject Matter
Claims 1-20 are currently subject to nonstatutory double patenting rejections, but are otherwise not subject to any prior art rejections under either 35 U.S.C. § 102 or 35 U.S.C. § 103. Assuming that the foregoing shortcomings of these claims were rectified by the timely filing of a terminal disclaimer, these claims would be allowable.
The following is a statement of reasons for the indication of allowable subject matter:
Independent claims 1 and 14 recite the same patentable features as were found allowable in parent application 17/450,848, which issued as United States Patent No. 12,019,710. These claims are allowable for the same reasons as were provided in the parent application.
Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP § 2146 et seq. for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
The filing of a terminal disclaimer by itself is not a complete reply to a nonstatutory double patenting (NSDP) rejection. A complete reply requires that the terminal disclaimer be accompanied by a reply requesting reconsideration of the prior Office action. Even where the NSDP rejection is provisional the reply must be complete. See MPEP § 804, subsection I.B.1. For a reply to a non-final Office action, see 37 CFR 1.111(a). For a reply to final Office action, see 37 CFR 1.113(c). A request for reconsideration while not provided for in 37 CFR 1.113(c) may be filed after final for consideration. See MPEP §§ 706.07(e) and 714.13.
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The actual filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/apply/applying-online/eterminal-disclaimer.
Claim 1 is rejected on the ground of nonstatutory double patenting as being unpatentable over claim 1 of U.S. Patent No. 12,019,710. Although the claims at issue are not identical, they are not patentably distinct from each other as shown in the following table:

| Present Application, Claim 1 | U.S. Patent No. 12,019,710, Claim 1 |
| --- | --- |
| A method, comprising: by at least one processor: | A method for managing and streaming a plurality of large-scale datasets, the method comprising: identifying, by one or more processors, … |
| identifying one or more data types associated with a plurality of large-scale datasets, wherein each large-scale dataset of the plurality of large-scale datasets comprises a plurality of data elements; | identifying, by one or more processors, one or more data types associated with the plurality of large-scale datasets, each large-scale dataset of the plurality of large-scale datasets comprising a plurality of data elements; |
| transforming each data element of the plurality of data elements into a set of tensors for each data type of the one or more data types, wherein the transforming comprises: | transforming, by the one or more processors, each data element of the plurality of data elements into a set of tensors for each data type of the one or more data types, …, wherein the transforming comprises: |
| receiving a plurality of transformation functions concatenated together as a dependency directed acyclic graph to transform the plurality of large-scale datasets from a first form into a second form; and | receiving a plurality of transformation functions concatenated together as a dependency directed acyclic graph to transform the plurality of large-scale datasets from one form into another… |
| storing, at a storage device, the transformed plurality of large-scale datasets based on each data type of the one or more data types. | storing, at a storage device, based on each data type, the plurality of large-scale datasets transformed… |


	
Claim 2 is rejected on the ground of nonstatutory double patenting as being unpatentable over claim 1 of U.S. Patent No. 12,019,710. Although the claims at issue are not identical, they are not patentably distinct from each other as shown in the following table:

| Present Application, Claim 2 | U.S. Patent No. 12,019,710, Claim 1 |
| --- | --- |
| The method of claim 1, wherein each data element of the plurality of data elements has an arbitrary shape and length, a set of data elements of the plurality of data elements ordered in a multi-dimensional space is treated as a dynamic tensor, and the storing comprises selecting a compression and storage strategy. | transforming… wherein the each data element of the plurality of data elements has an arbitrary shape and length and a set of data elements of the plurality of data elements ordered in a multi-dimensional space is treated as a dynamic tensor |
| | storing, … wherein the storing comprises selecting, by the one or more processors, a compression and storage strategy… |


	
Claim 3 is rejected on the ground of nonstatutory double patenting as being unpatentable over claim 1 of U.S. Patent No. 12,019,710. Although the claims at issue are not identical, they are not patentably distinct from each other as shown in the following table:

| Present Application, Claim 3 | U.S. Patent No. 12,019,710, Claim 1 |
| --- | --- |
| The method of claim 2, wherein the plurality of transformation functions is user-defined, serverless lambda functions which apply arbitrary computation to a single sample or a subsection of a sample of a tensor of the set of tensors or a large-scale dataset of the plurality of large-scale datasets. | receiving… wherein the plurality of transformation functions are user-defined, serverless lambda functions which apply arbitrary computation to a single sample or a subsection of a sample of a tensor of the set of tensors or a large-scale dataset of the plurality of large-scale datasets |


	
Claim 4 is rejected on the ground of nonstatutory double patenting as being unpatentable over claim 1 of U.S. Patent No. 12,019,710. Although the claims at issue are not identical, they are not patentably distinct from each other as shown in the following table:

| Present Application, Claim 4 | U.S. Patent No. 12,019,710, Claim 1 |
| --- | --- |
| The method of claim 3, further comprising identifying a suitable compression kernel that is personalized for each large-scale dataset of the plurality of large-scale datasets. | storing, …, wherein a suitable compression kernel that is personalized for each large-scale dataset of the plurality of large-scale datasets is identified. |


	
Claim 5 is rejected on the ground of nonstatutory double patenting as being unpatentable over claim 2 of U.S. Patent No. 12,019,710. Although the claims at issue are not identical, they are not patentably distinct from each other as shown in the following table:

| Present Application, Claim 5 | U.S. Patent No. 12,019,710, Claim 2 |
| --- | --- |
| The method of claim 1, wherein the plurality of large-scale datasets comprises at least one of structured datasets, semi-structured datasets, or unstructured datasets. | The method of claim 1, wherein the plurality of large-scale datasets comprise at least one of structured datasets, semi-structured datasets and/or unstructured datasets. |


	
Claim 6 is rejected on the ground of nonstatutory double patenting as being unpatentable over claim 3 of U.S. Patent No. 12,019,710. Although the claims at issue are not identical, they are not patentably distinct from each other as shown in the following table:

| Present Application, Claim 6 | U.S. Patent No. 12,019,710, Claim 3 |
| --- | --- |
| The method of claim 1, wherein the one or more data types comprise at least one of an image, a video, a text, an audio, numbers, or point cloud. | The method of claim 1, wherein the one or more data types comprise at least one of an image, video, text, audio, numbers, and/or point cloud. |


	
Claim 7 is rejected on the ground of nonstatutory double patenting as being unpatentable over claim 4 of U.S. Patent No. 12,019,710. Although the claims at issue are not identical, they are not patentably distinct from each other as shown in the following table:

| Present Application, Claim 7 | U.S. Patent No. 12,019,710, Claim 4 |
| --- | --- |
| The method of claim 1 further comprising providing a description corresponding to each data type of the one or more data types, wherein a set of data types of the one or more data types of varied length is stored into a single unified tensor preserving a shape of each large-scale dataset of the plurality of large-scale datasets. | The method of claim 1 further comprising providing, by the one or more processors, a description corresponding to each data type, wherein a set of data types of the one or more data types of varied length is stored into a single unified tensor preserving a shape of each large-scale dataset of the plurality of large-scale datasets. |


	
Claim 8 is rejected on the ground of nonstatutory double patenting as being unpatentable over claim 5 of U.S. Patent No. 12,019,710. Although the claims at issue are not identical, they are not patentably distinct from each other as shown in the following table:

| Present Application, Claim 8 | U.S. Patent No. 12,019,710, Claim 5 |
| --- | --- |
| The method of claim 1, wherein the plurality of transformation functions is user-defined, and the plurality of transformation functions that is user-defined is applied on a large-scale dataset of the plurality of large-scale datasets as a whole by distributing the large-scale dataset to multiple cores locally or to machines on a cloud. | The method of claim 1, wherein the plurality of transformation functions that is user-defined is applied on a large-scale dataset of the plurality of large-scale datasets as a whole by distributing the large-scale dataset to multiple cores locally or to machines on a cloud. |


	
Claim 9 is rejected on the ground of nonstatutory double patenting as being unpatentable over claim 6 of U.S. Patent No. 12,019,710. Although the claims at issue are not identical, they are not patentably distinct from each other as shown in the following table:

| Present Application, Claim 9 | U.S. Patent No. 12,019,710, Claim 6 |
| --- | --- |
| The method of claim 1, wherein further comprising: mapping and computing remotely one or more transformation functions of the plurality of transformation functions to create a new large-scale dataset; and storing the new large-scale dataset on the storage device. | The method of claim 1, wherein the transforming further comprises mapping and computing remotely one or more transformation functions of the plurality of transformation functions to create a new large-scale dataset and storing the new large-scale dataset on the storage device. |


	
Claim 10 is rejected on the ground of nonstatutory double patenting as being unpatentable over claim 7 of U.S. Patent No. 12,019,710. Although the claims at issue are not identical, they are not patentably distinct from each other as shown in the following table:

| Present Application, Claim 10 | U.S. Patent No. 12,019,710, Claim 7 |
| --- | --- |
| The method of claim 1, wherein further comprising: chunking each tensor of the set of tensors into one or more chunks; and storing the one or more chunks one of locally on a file system or on a remote storage, wherein the remote storage comprises at least one of an object storage or a conventional database. | The method of claim 1, wherein the storing comprises performing chunking, by the one or more processors, of each tensor in the set of tensors into one or more chunks, and storing the one or more chunks either locally on a file system or on a remote storage, wherein the remote storage comprises at least one of an object storage or a conventional database. |


	
Claim 11 is rejected on the ground of nonstatutory double patenting as being unpatentable over claim 8 of U.S. Patent No. 12,019,710. Although the claims at issue are not identical, they are not patentably distinct from each other as shown in the following table:

| Present Application, Claim 11 | U.S. Patent No. 12,019,710, Claim 8 |
| --- | --- |
| The method of claim 10, further comprising storing the one or more chunks in a memory for accessing slices within a chunk of the one or more chunks. | The method of claim 7 further comprising storing, by the one or more processors, the one or more chunks in a memory for accessing slices within a chunk of the one or more chunks. |


	
Claim 12 is rejected on the ground of nonstatutory double patenting as being unpatentable over claim 9 of U.S. Patent No. 12,019,710. Although the claims at issue are not identical, they are not patentably distinct from each other as shown in the following table:

| Present Application, Claim 12 | U.S. Patent No. 12,019,710, Claim 9 |
| --- | --- |
| The method of claim 1, further comprising executing unique versioning of the plurality of large-scale datasets, wherein a difference operator is expressed in terms of tensors and a sequence of commits is expressed as a superposition of linear transformations. | The method of claim 1 further comprising, performing, by the one or more processors, unique versioning of the plurality of large-scale datasets, wherein a difference operator is expressed in terms of tensors and a sequence of commits is expressed as a superposition of linear transformations. |


	
Claim 13 is rejected on the ground of nonstatutory double patenting as being unpatentable over claim 10 of U.S. Patent No. 12,019,710. Although the claims at issue are not identical, they are not patentably distinct from each other as shown in the following table:

| Present Application, Claim 13 | U.S. Patent No. 12,019,710, Claim 10 |
| --- | --- |
| The method of claim 1, further comprising executing asynchronous fetching of one or more large-scale datasets of the plurality of large-scale datasets from a local storage to a Graphics Processing Unit (GPU) using one or more data loaders, for training one or more machine learning models of a deep learning framework integrated with the plurality of large-scale datasets. | The method of claim 1 further comprising, performing asynchronous fetching of one or more large-scale datasets of the plurality of large-scale datasets from a local storage to a Graphics Processing Unit (GPU) using one or more data loaders, for training one or more machine learning models of a deep learning framework integrated with the plurality of large-scale datasets. |
















Claim 14 is rejected on the ground of nonstatutory double patenting as being unpatentable over claim 11 of U.S. Patent No. 12,019,710. Although the claims at issue are not identical, they are not patentably distinct from each other as shown in the following table:

| Present Application, Claim 14 | U.S. Patent No. 12,019,710, Claim 11 |
| --- | --- |
| A system, comprising: a memory; a processor communicatively coupled to the memory, wherein the processor is configured to: | A system for managing and streaming a plurality of large-scale datasets, the system comprising: a memory; a processor communicatively coupled to the memory, wherein the processor is configured to: |
| identify one or more data types associated with a plurality of large-scale datasets, wherein each large-scale dataset of the plurality of large-scale datasets comprises a plurality of data elements; | identify one or more data types associated with the plurality of large-scale datasets, each large-scale dataset of the plurality of large-scale datasets comprising a plurality of data elements; |
| transform each data element of the plurality of data elements into a set of tensors for each data type of the one or more data types; | transform each data element of the plurality of data elements into a set of tensors for each data type of the one or more data types… |
| receive a plurality of transformation functions concatenated together as a dependency directed acyclic graph to transform the plurality of large-scale datasets from a first form into a second form; and | receive a plurality of transformation functions concatenated together as a dependency directed acyclic graph to transform the plurality of large-scale datasets from one form into another… |
| store, at a storage device, the transformed plurality of large-scale datasets based on each data type of the one or more data types. | and store, at a storage device, based on each data type, the plurality of large-scale datasets transformed, wherein the processor is configured to select a compression and storage strategy, wherein a suitable compression kernel that is personalized for each large-scale dataset of the plurality of large-scale datasets is identified. |


	
Claim 15 is rejected on the ground of nonstatutory double patenting as being unpatentable over claim 11 of U.S. Patent No. 12,019,710. Although the claims at issue are not identical, they are not patentably distinct from each other as shown in the following table:

| Present Application, Claim 15 | U.S. Patent No. 12,019,710, Claim 11 |
| --- | --- |
| The system of claim 14, wherein each data element of the plurality of data elements has an arbitrary shape and length, and a set of data elements of the plurality of data elements ordered in a multi-dimensional space is treated as a dynamic tensor. | transform… wherein the each data element of the plurality of data elements has an arbitrary shape and length and a set of data elements of the plurality of data elements ordered in a multi-dimensional space is treated as a dynamic tensor, wherein the processor is configured to: |


	
Claim 16 is rejected on the ground of nonstatutory double patenting as being unpatentable over claim 11 of U.S. Patent No. 12,019,710. Although the claims at issue are not identical, they are not patentably distinct from each other as shown in the following table:

| Present Application, Claim 16 | U.S. Patent No. 12,019,710, Claim 11 |
| --- | --- |
| The system of claim 15, wherein the plurality of transformation functions is user-defined, serverless lambda functions which apply arbitrary computation to a single sample or a subsection of a sample of a tensor of the set of tensors or a large-scale dataset of the plurality of large-scale datasets. | receive… wherein the plurality of transformation functions are user-defined, serverless lambda functions which apply arbitrary computation to a single sample or a subsection of a sample of a tensor of the set of tensors or a large-scale dataset of the plurality of large-scale datasets; |


	
Claim 17 is rejected on the ground of nonstatutory double patenting as being unpatentable over claim 11 of U.S. Patent No. 12,019,710. Although the claims at issue are not identical, they are not patentably distinct from each other as shown in the following table:

| Present Application, Claim 17 | U.S. Patent No. 12,019,710, Claim 11 |
| --- | --- |
| The system of claim 16, wherein the processor is further configured to identify a suitable compression kernel that is personalized for each large-scale dataset of the plurality of large-scale datasets. | store… wherein the processor is configured to select a compression and storage strategy, wherein a suitable compression kernel that is personalized for each large-scale dataset of the plurality of large-scale datasets is identified. |


	
Claim 18 is rejected on the ground of nonstatutory double patenting as being unpatentable over claim 12 of U.S. Patent No. 12,019,710. Although the claims at issue are not identical, they are not patentably distinct from each other as shown in the following table:

| Present Application, Claim 18 | U.S. Patent No. 12,019,710, Claim 12 |
| --- | --- |
| The system of claim 14, wherein the plurality of large-scale datasets comprises at least one of structured datasets, semi-structured datasets, or unstructured datasets. | The system of claim 11, wherein the plurality of large-scale datasets comprise at least one of structured datasets, semi-structured datasets and/or unstructured datasets. |


	
Claim 19 is rejected on the ground of nonstatutory double patenting as being unpatentable over claim 13 of U.S. Patent No. 12,019,710. Although the claims at issue are not identical, they are not patentably distinct from each other as shown in the following table:

| Present Application, Claim 19 | U.S. Patent No. 12,019,710, Claim 13 |
| --- | --- |
| The system of claim 14, wherein the one or more data types comprise at least one of an image, a video, a text, an audio, numbers, or point cloud. | The system of claim 11, wherein the one or more data types comprise at least one of an image, video, text, audio, numbers, and/or point cloud. |


	
Claim 20 is rejected on the ground of nonstatutory double patenting as being unpatentable over claim 16 of U.S. Patent No. 12,019,710. Although the claims at issue are not identical, they are not patentably distinct from each other as shown in the following table:

| Present Application, Claim 20 | U.S. Patent No. 12,019,710, Claim 16 |
| --- | --- |
| The system of claim 14, wherein the processor is further configured to: map and compute remotely one or more transformation functions of the plurality of transformation functions to create a new large-scale dataset; and store the new large-scale dataset on the storage device. | The system of claim 11, wherein the processor is configured to map and compute remotely one or more transformation functions of the plurality of transformation functions to create a new large-scale dataset and store the new large-scale data set on the storage device. |


	
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to DAVID F DUNPHY whose telephone number is (571) 270-1230. The examiner can normally be reached 9 am - 5 pm. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Chineyere Wills-Burns, can be reached at (571) 272-9752. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/DAVID F DUNPHY/Primary Examiner, Art Unit 2673


Prosecution Timeline

Jun 14, 2024

Application Filed

Mar 24, 2026

Non-Final Rejection mailed — §DOUBLEPATENT (current)

Precedent Cases

Applications granted by this same examiner with similar technology

18/395,274

Patent 12626479

DIFFERENCE DETECTION DEVICE AND METHOD

2y 4m to grant Granted May 12, 2026

18/589,677

Patent 12621411

Methods and Apparatus for Displaying, Compressing and/or Indexing Information Relating to a Meeting

2y 2m to grant Granted May 05, 2026

18/302,230

Patent 12608926

METHOD AND DEVICE FOR TRAINING A NEURAL NETWORK

3y 0m to grant Granted Apr 21, 2026

18/555,171

Patent 12608934

System and Method for Estimating Dynamic Soil Parameters Based on Multispectral or Hyperspectral Image

2y 6m to grant Granted Apr 21, 2026

18/319,214

Patent 12596884

NATURAL LANGUAGE PROCESSING BASED DOMINANT ITEM DETECTION IN VIDEOS

2y 10m to grant Granted Apr 07, 2026

Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

1-2

Expected OA Rounds

85%

Grant Probability

95%

With Interview (+10.3%)

2y 2m (~2m remaining)

Median Time to Grant

Low

PTA Risk

Based on 769 resolved cases by this examiner. Grant probability derived from career allowance rate.
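
The arithmetic behind these projections can be sketched as follows. This is a minimal illustration, not the page's actual model: the function names are hypothetical, the additive treatment of interview lift is an assumption that happens to match the displayed 85% and 95% figures, and whether the 15.0% threshold is inclusive is also assumed.

```python
def allowance_rate(granted: int, resolved: int) -> float:
    """Career allowance rate in percent (granted / resolved cases)."""
    if resolved <= 0:
        raise ValueError("resolved must be positive")
    return 100.0 * granted / resolved


def recommend_response(interview_lift: float, threshold: float = 15.0) -> str:
    """Hypothetical rule: recommend an interview only when the lift
    (in percentage points) reaches the threshold (assumed inclusive)."""
    return "interview" if interview_lift >= threshold else "written response"


# Figures taken from the page: 654 granted of 769 resolved, +10.3% lift.
rate = allowance_rate(654, 769)
# Assumption: lift is added to the base rate and capped at 100%.
with_interview = min(rate + 10.3, 100.0)

print(f"Grant probability: {rate:.0f}%")
print(f"With interview:    {with_interview:.0f}%")
print(f"Recommendation:    {recommend_response(10.3)}")
```

With the examiner's figures, the rounded values agree with the 85% and 95% shown above, and the rule yields a written response because 10.3 is below 15.0.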