Last updated: May 29, 2026

Application No. 18/987,285

SEQUENCE DATA PROCESSING, RETENTION, AND RECOVERY

Final Rejection §102, §103

Filed

Dec 19, 2024

Priority

Dec 21, 2023 — provisional 63/613,287

Examiner

HU, JENSEN

Art Unit

2169

Tech Center

2100 — Computer Architecture & Software

Assignee

Illumina, Inc.

OA Round

2 (Final)

Interview Optional

An interview has already been conducted in this application's prosecution history. This examiner has a 68% grant rate with a +26.9% interview lift. Because an interview has already been tried, a written response with narrowed claims, informed by how claims evolved in precedent cases, is recommended.

Based on 540 resolved cases, 2023–2026

Examiner Intelligence

HU, JENSEN

Grants 68% — above average

Career Allowance Rate

366 granted / 540 resolved

+12.8% vs TC avg

Strong +27% interview lift

+26.9%

Interview Lift

resolved cases with interview

Typical timeline

3y 7m

Avg Prosecution

10 currently pending

Career history

553

Total Applications

across all art units

Statute-Specific Performance

§101

3.7%

-36.3% vs TC avg

§103

75.0%

+35.0% vs TC avg

§102

19.4%

-20.6% vs TC avg

§112

1.1%

-38.9% vs TC avg

Based on career data from 540 resolved cases

Office Action

§102 §103

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claims 1-27 are pending in this application.

Claim Rejections - 35 USC § 102

The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.

Claim(s) 1-7, 9-27 is/are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Bhola et al., US 2013/0031092 (hereinafter Bhola).

For claims 1, 14, 21, Bhola teaches a computer-implemented method comprising: 
obtaining sequence data produced by a sequencer device, the sequence data comprising genomic data of interest and metadata (see Bhola, [0007], [0032] – [0033], [0037], FASTQ “storing both a biological sequence (usually, a nucleotide sequence) and its corresponding quality score,” and “FASTQ file is input to the compressor” where FASTQ input, that contains biological sequence and quality score, into compressor represents obtained sequence data and metadata); 
processing the sequence data, the processing comprising: 
separating the genomic data of interest from the metadata (see Bhola, Fig. 2C, [0033], [0037] - [0040] “FASTQ parser 203 splits the file into text fields, DNA sequence and quality values” where “DNA sequence data 207” is separated from “quality data 208” and “text fields 206”); and 
compressing the separated genomic data of interest based on a reference sequence to produce compressed genomic data (see Bhola, Fig. 2C shows DNA sequence data 207 is passed to the “compressor” or “data encoder” for compression, and Fig. 6 shows subsequent compression or “encoding a DNA sequence” of referenced DNA sequence data 207, [0032], [0037], [0039], [0040] – [0041], [0052] – [0053], [0069], “identifies palindromic repeats in the DNA sequence” and determines “most efficient encoding method…to encoding each of these repeats” [0059] “encoding a DNA sequence” based on “repeats” in DNA sequence analyzed, where compressing/encoding of DNA sequence data based on “repeats” represents compressing based on reference sequence); and 
storing the compressed genomic data and the metadata (see Bhola, Fig. 2C, [0037] – [0038], [0059] – [0060], encoded DNA sequence, text data and quality data are “fed into the merger 209 which outputs the unified compressed bit stream” representing stored compressed genomic and metadata).
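The separating step mapped above can be pictured with a minimal Python sketch that splits a FASTQ input into title, sequence, and quality streams. This is an illustration only, assuming simple four-line FASTQ records; the function name `split_fastq` is hypothetical and the code is not Bhola's parser 203 or the applicant's implementation.

```python
from typing import List, Tuple


def split_fastq(text: str) -> Tuple[List[str], List[str], List[str]]:
    """Split 4-line FASTQ records into title, sequence, and quality streams.

    Assumption: each record is exactly four lines (title, sequence, '+', quality).
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if len(lines) % 4 != 0:
        raise ValueError("expected 4 lines per FASTQ record")
    titles, seqs, quals = [], [], []
    for i in range(0, len(lines), 4):
        title, seq, plus, qual = lines[i:i + 4]
        if not title.startswith("@") or not plus.startswith("+"):
            raise ValueError("malformed FASTQ record")
        if len(seq) != len(qual):
            raise ValueError("sequence and quality lengths differ")
        titles.append(title[1:])  # text field (metadata)
        seqs.append(seq)          # DNA sequence (genomic data)
        quals.append(qual)        # quality values (metadata)
    return titles, seqs, quals
```

Each returned stream could then be handed to its own encoder, which is the general shape of the pipeline the rejection describes.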

For claims 2, 15, 22, Bhola teaches wherein the separating uses a configuration file that indicates indexes, and the separating separates the genomic data of interest from the metadata based on the indexes indicated by the configuration file (see Bhola, [0040], [0059], “dynamic dictionary is used to find the repeats” that includes “hash table…for indexing and fast repeat finding” of sequences for parsing of input and separation from metadata).

For claim 3, Bhola teaches the method of claim 1, wherein the metadata comprises index data for a plurality of reads (see Bhola, [0039] – [0040], where parsed “text data”, representing index metadata, includes “title lines…of repeating and variable fields” for utilizing in “sequence read” operations).

For claims 4, 16, 23, Bhola teaches wherein the separating comprises using the index data to demultiplex at least a portion of the sequence data to provide the separated genomic data of interest as per-sample genomic data (see Bhola, Fig. 2C, [0005], [0029], [0037] – [0040], [0059], “parser 203 splits the file” into at least one “DNA sequence,” where this separation of the data represents demultiplexing, and where a derived “DNA sequence” represents a single sample of an organism, that is, at least one per-sample genomic data), wherein the compressing provides compressed per-sample genomic data as the compressed genomic data, and wherein the storing stores the index data (see Bhola, Fig. 2C, [0037] – [0040], [0059] – [0060], where “compressed” data includes encoded/compressed “DNA sequence” per-sample data and encoded text data 206).

For claim 5, Bhola teaches the method of claim 1, wherein the processing trims at least some of the metadata from other data of the sequence data (see Bhola, Fig. 2C, [0039], when encoding title lines, “stores the repeating fields only once as part of the header” means trimming other repeated fields, [0054], “method discards the encoding of the field in which an appropriate flag to indicate the same is set”).

For claims 6, 17, 24, Bhola teaches wherein the trimmed metadata comprises adapter data, Unique Molecular Identifiers (UMI) data, and/or data selected to be ignored, and wherein the storing stores each of the adapter data, Unique Molecular Identifiers (UMI) data, and/or data selected to be ignored (see Bhola, [0039], [0054] where discarded “encoding” or “repeating” data represents data to be ignored for storing).

For claim 7, Bhola teaches the method of claim 1, wherein the processing further comprises compressing the metadata to provide compressed metadata, wherein the storing stores the compressed metadata (see Bhola, Fig. 2C, [0037] – [0038], [0068], encoding/compressing text data for “merging”).

For claim 9, Bhola teaches the method of claim 1, wherein the storing stores the compressed genomic data in one or more data files that also store the metadata (see Bhola, Fig. 2C, [0037] – [0038], [0059], [0068], “merging” “text data 206” and “sequence data 207” to be “compressed” together).

For claims 10, 18, 25, Bhola teaches further comprising, based on a request, recovering the sequence data from the stored compressed genomic data and metadata (see Bhola, [0037] - [0038], where “decompression” of sequence data represents recovery), the recovering comprising: 
decompressing the compressed genomic data to provide decompressed genomic data of interest as the separated genomic data of interest (see Bhola, [0037] - [0038], “decompression” of “compressed bit stream”); and 
combining the decompressed genomic data of interest with the metadata to provide combined genomic data and metadata (see Bhola, [0038], where “output the decompressed fields” represents combined genomic data and metadata).
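The store-then-recover round trip recited in these claims can be sketched as follows. This is a hedged illustration, not the method of Bhola or of the application: `zlib` is a general-purpose compressor standing in for whatever encoder is actually used, the names `store` and `recover` are hypothetical, and the sketch assumes at least one record and titles without tab or newline characters.

```python
import zlib
from typing import Dict, List


def store(titles: List[str], seqs: List[str], quals: List[str]) -> Dict[str, bytes]:
    """Compress genomic data and metadata as two separate blobs."""
    genomic = "\n".join(seqs).encode("ascii")
    metadata = "\n".join(t + "\t" + q for t, q in zip(titles, quals)).encode("ascii")
    return {"genomic": zlib.compress(genomic), "metadata": zlib.compress(metadata)}


def recover(blobs: Dict[str, bytes]) -> str:
    """Decompress both blobs and recombine them into FASTQ text."""
    seqs = zlib.decompress(blobs["genomic"]).decode("ascii").split("\n")
    meta = zlib.decompress(blobs["metadata"]).decode("ascii").split("\n")
    records = []
    for seq, line in zip(seqs, meta):
        # Quality strings never contain tabs, so split on the last tab.
        title, qual = line.rsplit("\t", 1)
        records.append(f"@{title}\n{seq}\n+\n{qual}\n")
    return "".join(records)
```

Keeping the two blobs separate also mirrors the two-file arrangement at issue for claim 8, while writing both into one container would correspond to claim 9.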

For claims 11, 19, 26, Bhola teaches wherein the metadata comprises index data for a plurality of reads (see Bhola, [0039] – [0040], where parsed “text data”, representing index metadata, includes “title lines…of repeating and variable fields” for utilizing in “sequence read” operations), wherein the separating comprises using the index data to demultiplex at least a portion of the sequence data to provide the separated genomic data of interest as per-sample genomic data (see Bhola, [0005], [0029], [0037] – [0040], [0059] – [0060], determine repeated sequences from index metadata, text data, and separate repeated DNA sequences from the input to compress for the given input sample), wherein the compressing provides compressed per-sample genomic data as the compressed genomic data, wherein the storing stores the index data (see Bhola, [0037] – [0040], [0059] – [0060] where “compressed” data includes “merger” of “DNA sequence” sample and “text data”), and wherein the combining comprises remultiplexing the decompressed genomic data of interest with the metadata to provide the combined genomic data and metadata (see Bhola, [0038], “output of the decompressed fields” such that “original FASTQ file is output” represents remultiplexing the decompressed data).

For claims 12, 20, 27, Bhola teaches wherein the processing further comprises compressing the metadata to provide compressed metadata, wherein the storing stores the compressed metadata (see Bhola, [0037], “merger” of “text data” with “sequence data” for “compressed bit stream” represents compressing metadata), and wherein the recovering further comprises decompressing the compressed metadata to provide decompressed metadata as the metadata that is combined with the decompressed genomic data of interest (see Bhola, [0038], “decompression” of compressed bit stream to yield “decompressed fields” of metadata and sequence data).

For claim 13, Bhola teaches the method of claim 1, further comprising sequencing, by the sequencer device, genomic material to produce and obtain the sequence data, wherein the sequencer device performs the obtaining, the processing, and the storing, and wherein the storing stores the compressed genomic data and the metadata to a storage device of the sequencer device (see Bhola, [0037] – [0038], [0040], [0059] - [0060], “sequence read” of input file via utilization of “dynamic dictionary” represents sequencing for compression and storage in “compressed file”).

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim(s) 8 is/are rejected under 35 U.S.C. 103 as being unpatentable over Bhola et al., US 2013/0031092 (hereinafter Bhola) in view of Alberti et al., US 2023/0274800 (hereinafter Alberti).

For claim 8, Alberti teaches wherein the storing stores the compressed genomic data in a first one or more data files and stores the metadata in a second one or more data files different from the first one or more data files (see Alberti, [0048], [0051], [0169] – [0174] “storing or transmitting the encoded genome sequencing data on or to a computer-readable storage medium,” [0190], “Said sequencing reads and the associated annotations can as well be decoupled and encapsulated in separate files”).  It would have been obvious to one skilled in the art at the time of the invention to modify the teachings of Bhola with the teachings of Alberti to reduce the storage space for compressed representations of genomic sequencing data (see Alberti, [0011] – [0017] “compressing separately non-indexed descriptors from indexed textual descriptors is that these 2 classes of data, once separately grouped, show a lower entropy than when they are coded together, therefore higher compression ratio can be achieved”).

Response to Arguments

Applicant's arguments with respect to claims rejected under 35 U.S.C. 102(a)(1) and 35 U.S.C. 103 have been fully considered but they are not persuasive. 

The applicant argues “Bhola’s compression based on repeating sub-strings relates to compression of the title lines.  The title lines are part of the “text data” (see paragraph [0039] of Bhola), which, as noted above, the Office Action interpreted to be the claimed metadata rather than separated genomic data of interest.  In other words, the compression based on sub-string repetition related to compression of FASTQ metadata in Bhola, rather than “separated genomic data of interest.”  The examiner respectfully disagrees.

The applicant has narrowly interpreted Bhola to only comprise compression of title line text data.  However, as Fig. 2C and Fig. 3 show, “sequence data 207” is also separated from the input file and encoded/compressed based on repeated sequence (reference) data (see Bhola, Figs. 2C, 3, [0039] – [0041], [0059] – [0060]).  Specifically, Bhola teaches “encoding a DNA sequence” based on determined “repeat” sequences (see Bhola, [0059]).  Accordingly, while Bhola teaches compression of FASTQ metadata, Bhola also teaches compression of DNA sequence data.

The applicant argues “Bhola fails to provide any teaching of using reference-based compression per se in its compression.”  The examiner respectfully disagrees.  As disclosed in the corresponding rejection above, Bhola teaches a method of determining “certain sub-strings repeating across” the title lines and further utilizing the “repeat finding” to compress/encode the “DNA sequence” (see Bhola, Fig. 2C, [0040] – [0041], [0059]).
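The disagreement here turns on what counts as compressing “based on a reference sequence.” As a purely illustrative sketch, and not a reading of either Bhola's encoder or the claims, the two ideas at issue can be contrasted in Python: encoding a read as differences against an external reference, versus encoding a sequence with back-references to repeats found earlier in the same sequence. All function names are hypothetical.

```python
from typing import List, Tuple, Union


def encode_against_reference(read: str, reference: str, pos: int):
    """Reference-based: keep position, length, and mismatches vs. an external reference."""
    window = reference[pos:pos + len(read)]
    if len(window) != len(read):
        raise ValueError("read does not fit in the reference at this position")
    mismatches = [(i, b) for i, (b, r) in enumerate(zip(read, window)) if b != r]
    return pos, len(read), mismatches


def decode_against_reference(encoded, reference: str) -> str:
    pos, length, mismatches = encoded
    bases = list(reference[pos:pos + length])
    for i, b in mismatches:
        bases[i] = b
    return "".join(bases)


def encode_with_repeats(seq: str, min_len: int = 4) -> List[Union[str, Tuple[int, int]]]:
    """Repeat-based: replace a substring seen earlier in the SAME sequence
    with an (offset, length) back-reference; other bases stay literal."""
    out: List[Union[str, Tuple[int, int]]] = []
    i = 0
    while i < len(seq):
        best_off, best_len = -1, 0
        for j in range(i):
            k = 0
            # Only match against already-emitted text (no overlapping copies).
            while i + k < len(seq) and j + k < i and seq[j + k] == seq[i + k]:
                k += 1
            if k > best_len:
                best_off, best_len = j, k
        if best_len >= min_len:
            out.append((best_off, best_len))
            i += best_len
        else:
            out.append(seq[i])
            i += 1
    return out


def decode_with_repeats(tokens) -> str:
    out: List[str] = []
    for t in tokens:
        if isinstance(t, tuple):
            off, length = t
            out.extend(out[off:off + length])
        else:
            out.append(t)
    return "".join(out)
```

The first pair needs the external reference at decode time, while the second is self-contained. Whether the second can be said to use a “reference sequence” is the point the applicant and the examiner dispute.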

The applicant argues Bhola fails to teach “demultiplex at least a portion of the sequence data to provide the separated genomic data of interest as per-sample genomic data.”  The examiner respectfully disagrees.  

First, one skilled in the art at the time of the invention would interpret “per-sample genomic data” simply as sample genomic data from a single organism.  As disclosed in the corresponding rejection above, Bhola teaches a method of receiving a “DNA sequence” wherein the received sample sequence is from a single organism (see Bhola, [0037], [0040], [0059]).  In other words, a received and analyzed DNA sample from a single organism represents per-sample genomic data.  Examiner notes that the claims do not recite receiving, identifying, and compressing multiple samples from different organisms as the applicant argues.  Second, Bhola teaches demultiplexing by reciting a process of separating multiple data signals, combined into a single file, back into individual data signals.  For example, Bhola teaches a method of receiving a FASTQ input that includes text data 206, sequence data 207, and quality data 208 combined in one file.  The data is then parsed/separated into multiple individual signals of text data, sequence data, and quality data for individual encoding (see Bhola, Fig. 2C, [0037] – [0041], [0059]).
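For context, in sequencing workflows “demultiplexing” commonly refers to assigning pooled reads to samples by their index (barcode) sequences. The sketch below illustrates that common usage under simplifying assumptions (exact index matches, reads given as index/sequence pairs); it is not the claim construction adopted in this action, and the names `demultiplex` and `sample_by_index` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def demultiplex(
    reads: Iterable[Tuple[str, str]],
    sample_by_index: Dict[str, str],
) -> Dict[str, List[str]]:
    """Group read sequences by sample using each read's index sequence.

    Reads whose index is not in the mapping go to an 'undetermined' bin.
    """
    per_sample: Dict[str, List[str]] = defaultdict(list)
    for index_seq, read_seq in reads:
        sample = sample_by_index.get(index_seq, "undetermined")
        per_sample[sample].append(read_seq)
    return dict(per_sample)
```

Under this usage, the output is a set of per-sample read groups, which differs from splitting a single file into its text, sequence, and quality fields.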

Conclusion

THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to JENSEN HU whose telephone number is (571)270-3803. The examiner can normally be reached Monday - Friday 9-5 PT.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Sherief Badawi, can be reached at 571-272-9782.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/JENSEN HU/Primary Examiner, Art Unit 2169


Prosecution Timeline

Dec 19, 2024

Application Filed

Sep 04, 2025

Non-Final Rejection mailed — §102, §103

Dec 03, 2025

Examiner Interview Summary

Dec 03, 2025

Applicant Interview (Telephonic)

Dec 04, 2025

Response Filed

Apr 02, 2026

Final Rejection mailed — §102, §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

16/734,035

Patent 12639305

METHOD FOR SHARING LANDMARKS FOR FAST PROCESSING OF TOP K CHEAPEST PATH QUERIES

6y 4m to grant

Granted May 26, 2026

18/246,568

Patent 12632470

SYSTEM AND METHOD FOR TIME-SPATIAL DATA PARTITIONING IN A BLOCKCHAIN NETWORK

3y 1m to grant

Granted May 19, 2026

18/623,257

Patent 12613866

PLACEMENT OF ADAPTIVE AGGREGATION OPERATORS AND PROPERTIES IN A QUERY PLAN

2y 0m to grant

Granted Apr 28, 2026

17/916,692

Patent 12608347

INFORMATION PROCESSING SYSTEM, INFORMATION PROCESSING APPARATUS, SERVER APPARATUS, PROGRAM, OR METHOD

3y 6m to grant

Granted Apr 21, 2026

18/160,850

Patent 12608375

STATIC APPROACH TO LAZY MATERIALIZATION IN DATABASE SCANS USING PUSHED FILTERS

3y 2m to grant

Granted Apr 21, 2026

Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

3-4

Expected OA Rounds

68%

Grant Probability

95%

With Interview (+26.9%)

3y 7m (~2y 2m remaining)

Median Time to Grant

Moderate

PTA Risk

Based on 540 resolved cases by this examiner. Grant probability derived from career allowance rate.
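The displayed projections appear consistent with simple arithmetic on the career counts. A quick check in Python, assuming the with-interview figure is the allowance rate plus the interview lift in percentage points (the page does not state how that figure is computed):

```python
granted, resolved = 366, 540          # counts shown under Career Allowance Rate
allowance_pct = 100 * granted / resolved

interview_lift_pts = 26.9             # interview lift shown on the page
# Assumption: the lift is added to the allowance rate in percentage points.
with_interview_pct = allowance_pct + interview_lift_pts

print(round(allowance_pct), round(with_interview_pct))  # prints: 68 95
```

These are career-level averages for the examiner, not estimates specific to this application.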