Office Action Analysis: 17825220 — IDENTIFICATION OF MATCHED SEGMENTS IN PAIRED DATASETS

Office Action

§101 §103 §112
DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Status
Claims 1-20 are currently pending and examined on the merits.
Claims 1-20 are rejected.

Priority
	The instant application claims priority to U.S. Provisional Application 63/193,788 filed on 27 May 2021. At this point in examination, the effective filing date of claims 1-20 is 27 May 2021.

Information Disclosure Statement
The information disclosure statement (IDS) submitted on 3 April 2023 is in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statement has been considered by the examiner.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 1, 14, and 20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
The term “within” in claim 1, last line, claim 14, line 22, and claim 20, last line is a relative term which renders the claim indefinite. The term “within” is not defined by the claim, the specification does not provide a standard for ascertaining the requisite degree, and one of ordinary skill in the art would not be reasonably apprised of the scope of the invention. It is unclear how the matched segment is contained "within" two homogeneous mismatched locations: the term could mean that the segment lies between the two locations, or that it lies within a range defined by two locations upstream and downstream of the matched segment. The specification is also silent as to which meaning the term "within" takes. For examination purposes, the term "within" has been construed to mean located between two homogeneous mismatched locations.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


	Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claims recite: (a) mathematical concepts, (e.g., mathematical relationships, formulas or equations, mathematical calculations); and (b) mental processes, i.e., concepts performed in the human mind, (e.g., observation, evaluation, judgement, opinion).
	Subject matter eligibility evaluation in accordance with MPEP 2106:
	Eligibility Step 1: Claims 1-13 are directed to a method (process) for identifying one or more segments of a target dataset that match segments of other datasets in a database. Claims 14-19 are directed to a system (machine). Claim 20 is directed to a non-transitory computer-readable storage medium (machine). Therefore, these claims are encompassed by the categories of statutory subject matter, and thus satisfy the subject matter eligibility requirements under Step 1.
	[Step 1: YES]
	Eligibility Step 2A: First, it is determined in Prong One whether a claim recites a judicial exception, and if so, then it is determined in Prong Two whether the recited judicial exception is integrated into a practical application of that exception.
	Eligibility Step 2A, Prong One: In determining whether a claim is directed to a judicial exception, the claim is analyzed to determine whether it recites a judicial exception, i.e., whether a law of nature, natural phenomenon, or abstract idea is set forth or described in the claim.
	Claims 1-6, 8-10, and 14-20 recite the following steps which fall within the mental processes and/or mathematical concepts groups of abstract ideas, as noted below.
	Independent claims 1, 14, and 20 further recite:
identifying one or more segments of a target dataset that match segments of other datasets in a database (i.e., mental processes);
encoding the target dataset to generate a pair of encoded target bitmap sequences based on an encoding scheme (i.e., mental processes);
the encoding scheme defines encoding values based on homogeneity between the pair of data value sequences (i.e., mental processes);
the pair of encoded target bitmap sequences comprises a first encoded target bitmap sequence that encodes a first type of homogeneous locations and a second encoded target bitmap sequence that encodes a second type of homogeneous locations (i.e., mental processes);
comparing the pair of encoded target bitmap sequences with other pairs of encoded bitmap sequences to identify homogeneous mismatched locations (i.e., mental processes);
identifying a matched segment between the target dataset and one of the other datasets based on the homogeneous mismatched locations identified (i.e., mental processes).
Dependent claims 2 and 15 further recite:
wherein comparing the pair of encoded target bitmap sequences with the other pairs of encoded bitmap sequences to identify homogeneous mismatched locations (i.e., mental processes);
sampling the pair of encoded target bitmap sequences to generate a pair of sparse target bitmap sequences (i.e., mental processes);
comparing the pair of sparse target bitmap sequences to other pairs of sparse bitmap sequences (i.e., mental processes).
Dependent claims 3 and 16 further recite:
wherein identifying the matched segment between the target dataset and one of the other datasets based on homogeneous mismatched locations identified (i.e., mental processes);
using the comparison between the pair of sparse target bitmap sequences to other pairs of sparse bitmap sequences as a pre-scan to eliminate mismatches (i.e., mental processes);
comparing, responsive to one of the other datasets passing the pre-scan, the target dataset and said one of the other datasets to identify the matched segment (i.e., mental processes).
Dependent claims 4 and 17 further recite:
wherein comparing the pair of encoded target bitmap sequences with one of the other pairs of encoded bitmap sequences to identify the homogeneous mismatched locations (i.e., mental processes);
identifying a seed range of match between the target dataset and another dataset corresponding to said one of the other pairs of encoded bitmap sequences (i.e., mental processes);
comparing the pair of sparse target bitmap sequences with one of the other pairs of sparse bitmap sequences upstream and downstream of the seed range to identify the homogeneous mismatched locations (i.e., mental processes).
Dependent claim 5 further recites:
wherein comparing the pair of encoded target bitmap sequences with one of the other pairs of encoded bitmap sequences upstream and downstream of the seed range stops at a threshold range (i.e., mental processes).
Dependent claim 6 further recites:
wherein identifying a matched segment between the target dataset and one of the other datasets based on the homogeneous mismatched locations identified (i.e., mental processes);
comparing the pair of encoded target bitmap sequences and a pair of encoded bitmap sequences corresponding to said one of the other datasets location-by-location to identify the homogeneous mismatched locations (i.e., mental processes);
identifying a candidate segment that is between two homogeneous mismatched locations (i.e., mental processes);
determining a length of the candidate segment (i.e., mental processes);
determining, responsive to the length being larger than a threshold, that the candidate segment is a matched segment (i.e., mental processes).
Dependent claims 8 and 18 further recite:
wherein comparing the pair of encoded target bitmap sequences with another pair of encoded bitmap sequences to identify homogeneous mismatched locations (i.e., mental processes);
comparing the first encoded target bitmap sequence that encodes the first type of homogeneous locations of the target dataset to a second encoded bitmap sequence of said another pair, the second encoded bitmap sequence encoding the second type of homogeneous locations of another dataset (i.e., mental processes);
identifying a common location that indicates the target dataset and the other dataset in comparison are both homogeneous (i.e., mental processes).
Dependent claims 9 and 19 further recite:
wherein comparing the first encoded target bitmap sequence that encodes the first type of homogeneous locations of the target dataset to the second encoded bitmap sequence of said another pair (i.e., mental processes).
Dependent claim 10 further recites:
wherein the encoding scheme defines that the first encoded target bitmap sequence has a first value if the pair of data value sequences are homogeneous of the first type and has a second value otherwise, and the encoding scheme defines that the second encoded target bitmap sequence has the first value if the pair of data value sequences are homogeneous of the second type and has the second value otherwise (i.e., mental processes).
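As a reading aid for the encoding limitation of claim 10, the following is a minimal sketch and not the applicant's implementation. It assumes that the pair of data value sequences are two equal-length allele strings, that the "first type" of homogeneity means both values equal a hypothetical reference allele, that the "second type" means both equal a hypothetical alternate allele, and that the first and second values are 1 and 0. The function name `encode_pair` and its parameters are illustrative assumptions.

```python
def encode_pair(seq_a, seq_b, ref_allele, alt_allele):
    """Return (first_bitmap, second_bitmap) as lists of 0/1 values.

    first_bitmap[i] is 1 when both sequences carry ref_allele at i
    (assumed first type of homogeneity); second_bitmap[i] is 1 when both
    carry alt_allele at i (assumed second type). Other locations get 0.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have equal length")
    first, second = [], []
    for a, b in zip(seq_a, seq_b):
        first.append(1 if a == b == ref_allele else 0)
        second.append(1 if a == b == alt_allele else 0)
    return first, second


first, second = encode_pair("AAGA", "AGGA", ref_allele="A", alt_allele="G")
print(first)   # [1, 0, 0, 1]
print(second)  # [0, 0, 1, 0]
```

In this toy input, the second location is heterogeneous (A and G), so neither bitmap is set there.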
The abstract ideas recited in the claims are evaluated under the broadest reasonable interpretation (BRI) of the claim limitations when read in light of and consistent with the specification. As noted in the foregoing section, the claims are determined to contain limitations that can practically be performed in the human mind with the aid of a pencil and paper, and therefore recite judicial exceptions from the mental process grouping of abstract ideas. Additionally, the recited limitations that are identified as judicial exceptions from the mathematical concepts grouping of abstract ideas are abstract ideas irrespective of whether or not the limitations are practical to perform in the human mind.
Therefore, claims 1-6, 8-10, and 14-20 recite an abstract idea.
[Step 2A, Prong One: YES]
	Eligibility Step 2A, Prong Two: In determining whether a claim is directed to a judicial exception, further examination is performed that analyzes if the claim recites additional elements that, when examined as a whole, integrates the judicial exception(s) into a practical application (MPEP 2106.04(d)). A claim that integrates a judicial exception into a practical application will apply, rely on, or use the judicial exception in a manner that imposes a meaningful limit on the judicial exception. The claimed additional elements are analyzed to determine if the abstract idea is integrated into a practical application (MPEP 2106.04(d)(I); MPEP 2106.05(a-h)). If the claim contains no additional elements beyond the abstract idea, the claim fails to integrate the abstract idea into a practical application (MPEP 2106.04(d)(III)).
	The judicial exceptions identified in Eligibility Step 2A, Prong One are not integrated into a practical application because of the reasons noted below.
	Claims 1-6, 8, 10, and 15-18 do not recite any elements in addition to the judicial exception, and thus are part of the judicial exception.
	Claims 9 and 19 recite running both the first encoded target bitmap sequence of the target dataset and the second encoded bitmap sequence of said another pair through a bitwise AND operation. The limitation of running encoded bitmap sequences through a bitwise AND operation provides nothing more than mere instructions to implement an abstract idea on a generic computer. See MPEP 2106.05(f). Therefore, the claimed additional elements do not integrate the abstract ideas into a practical application.
	Claims 14 and 20 recite the additional non-abstract element (EIA) of a general-purpose computer system or parts thereof:
a system comprising: a computing device comprising one or more processors and memory configured to store instructions (claim 14);
a graphical user interface configured to present result related to the identified matched segment to a user (claim 14);
a non-transitory computer-readable medium configured to store instructions (claim 20).
The EIA do not provide any details of how specific structures of the computer elements are used to implement the judicial exception (JE). The claims require nothing more than a general-purpose computer to perform the functions that constitute the judicial exceptions. The computer elements of the claims do not provide improvements to the functioning of the computer itself (as in DDR Holdings, LLC v. Hotels.com LP); they do not provide improvements to any other technology or technical field (as in Diamond v. Diehr); nor do they utilize a particular machine (as in Eibel Process Co. v. Minn. & Ont. Paper Co.). Hence, these are mere instructions to apply the JE using a computer, and therefore the claim does not integrate that JE into a practical application.
	All limitations in claims 1-20 have been considered as a whole, and are deemed to not recite any additional elements that would integrate a judicial exception into a practical application. Claims 9, 14, 19, and 20 contain additional elements that would not integrate a judicial exception into a practical application and are further probed for inventive concept in Step 2B.
	[Step 2A, Prong Two: NO]
	Eligibility Step 2B: Because the claims recite an abstract idea, and do not integrate that abstract idea into a practical application, the claims are probed for a specific inventive concept. The judicial exception alone cannot provide that inventive concept or practical application (MPEP 2106.05). Identifying whether the additional elements beyond the abstract idea amount to such an inventive concept requires considering the additional elements individually and in combination to determine if they amount to significantly more than the judicial exception (MPEP 2106.05A i-vi).
	The claims do not include any additional elements that are sufficient to amount to significantly more than the judicial exception(s) because of the reasons noted below.
	With respect to claims 14 and 20: The limitations identified above as non-abstract elements (EIA) related to general-purpose computer systems do not rise to the level of significantly more than the judicial exception. These elements do not improve the functioning of the computer itself, or comprise an improvement to any other technical field (Trading Technologies Int’l v. IBG, TLI Communications). They do not require or set forth a particular machine (Ultramercial v. Hulu, LLC., Alice Corp. Pty. Ltd v. CLS Bank Int’l), they do not effect a transformation of matter, nor do they provide an unconventional step. Simply appending well-understood, routine, conventional activities previously known to the industry, specified at a high level of generality, to the judicial exception is insufficient to provide significantly more (as discussed in Alice Corp., CyberSource v. Retail Decisions, Parker v. Flook, Versata Development Group v. SAP America).
	The additional element of running both the first encoded target bitmap sequence of the target dataset and the second encoded bitmap sequence of said another pair through a bitwise AND operation (claims 9 and 19) is conventional. Evidence for conventionality is shown by Layer et al. (Nature Methods, 2015, 13(1), 63-65). Layer et al. states “For the 24 genotypes given here (3 individuals, 8 genotypes each), the ASCII-base algorithm executes the “if” statement 24 times, while the bit-wise algorithm executes the logical AND (“&”) only three times, with both algorithms producing equivalent results.” (Supplementary Figure 3b, lines 6-8). Layer et al. further states “Bitmaps allow for efficient comparisons of many genotypes in a single operation by means of bitwise logical operations” (Online Methods, Section “Overview of the GQT genotype-indexing strategy”, paragraph 1, lines 8-9). This shows that bitmap sequence comparisons can be made using bitwise AND operations, which makes it a conventional practice in the art.
	[Step 2B: NO]
	Therefore, claims 1-20 are patent ineligible under 35 U.S.C. § 101.
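The bitwise AND operation discussed above for claims 9 and 19 can be illustrated with a small sketch under stated assumptions. Each bitmap is held as a Python integer in which bit i stands for location i, and a homogeneous mismatched location is taken to be one where one dataset is homogeneous of the first type and the other is homogeneous of the second type. The helper names and sample values are hypothetical and are drawn from neither the application nor Layer et al.

```python
def homogeneous_mismatches(target_first, target_second, other_first, other_second):
    """Bitmask of locations where the two datasets are homogeneous of opposite types."""
    # Target is first-type where the other is second-type, or the reverse.
    return (target_first & other_second) | (target_second & other_first)


def mismatches_by_loop(target_first, target_second, other_first, other_second, n):
    """Same result computed one location at a time, for comparison."""
    mask = 0
    for i in range(n):
        t1 = (target_first >> i) & 1
        t2 = (target_second >> i) & 1
        o1 = (other_first >> i) & 1
        o2 = (other_second >> i) & 1
        if (t1 and o2) or (t2 and o1):
            mask |= 1 << i
    return mask


# Six locations, written most-significant bit first (location 5 ... location 0).
t_first, t_second = 0b100101, 0b010010
o_first, o_second = 0b000011, 0b100100

fast = homogeneous_mismatches(t_first, t_second, o_first, o_second)
slow = mismatches_by_loop(t_first, t_second, o_first, o_second, 6)
print(format(fast, "06b"), fast == slow)  # 100110 True
```

The two functions return the same mask; the first does so with two AND operations and one OR regardless of how many locations the integers cover.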

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1-6 and 8-20 are rejected under 35 U.S.C. 103 as being unpatentable over Ball et al. (Discovering genetic matches across a massive, expanding genetic database, 31 March 2016, AncestryDNA Matching White Paper, ancestryDNA, 1-46, https://www.ancestry.com/cs/dna-help/matches/whitepaper), in view of Layer et al. (Nature Methods, 2015, 13(1), 63-65).
With respect to claims 1, 14, and 20:
With respect to the recited identifying one or more segments of a target dataset that match segments of other datasets in a database in claim 1, Ball et al. discloses “The first goal of DNA matching is to accurately identify the DNA segments on the 22 chromosome pairs that are identical-by-descent between pairs of individuals. Importantly, we would like to identify these IBD segments for every pair of customers in our database.” (Pages 5-6, paragraph 2, lines 1-4). DNA segments are identical-by-descent if they are matching segments shared between two or more individuals. Therefore, this suggests that more than one IBD segment is identified in pairs of individuals, which indicates a target dataset and other datasets in a database.
With respect to the recited comparing the pair of encoded target bitmap sequences with other pairs of encoded bitmap sequences to identify homogeneous mismatched locations, the other encoded bitmap sequences generated from the other datasets using the encoding scheme, wherein a homogeneous mismatched location is a location where the target dataset and the other dataset in comparison are both homogeneous but have different types of homogeneity at the location, Ball et al. discloses “For each pair of individuals, identify windows in which the alleles at all SNPs in one of the individual’s two phased haplotypes are identical to all the alleles at the same positions in one of the other individual’s phased haplotypes. We call these “seed matches” (see Figure 3.1, section D).” (Pages 16-17, 3.1. Matching Algorithm, step 2, lines 1-3). Also, further discloses “For each seed match, we attempt to extend the seed match in both directions along the chromosome until (a) the beginning or end of the chromosome is reached, or (b) a homozygous mismatch is detected. A homozygous mismatch is a pair of genotypes at the same SNP that are incompatible regardless of how they are phased (for example, AA and GG). The estimated IBD region is defined by the start and end positions of the SNPs included in the extended segment (see Figure 3.1, section D).” (Page 17, 3.1. Matching Algorithm, step 3, lines 1-7). The windows of two phased haplotypes in one individual compared to a window of two phased haplotypes in another individual suggests a comparison of a pair of encoded target bitmap sequences with other pairs of encoded bitmap sequences as this is done in each pair of individuals. The seed match resulting from this comparison is used to identify homogeneous mismatched locations, where the pairs of genotypes at this location are both homogeneous and have the same alleles, but have different types of homogeneity as described in the example.
With respect to the recited identifying a matched segment between the target dataset and one of the other datasets based on the homogeneous mismatched locations identified, wherein the matched segment is contained within two homogeneous mismatched locations, Ball et al. discloses “For each seed match, we attempt to extend the seed match in both directions along the chromosome until (a) the beginning or end of the chromosome is reached, or (b) a homozygous mismatch is detected.” (Page 17, 3.1. Matching Algorithm, step 3, lines 1-3). Also, further discloses “If the segment is longer than 6 cM, we store that segment as a match in the database.” (Page 17, 3.1. Matching Algorithm, step 5, lines 1-2). The extended seed match indicates the matched segment identified between the target dataset and one of the other datasets. This is based on homogeneous mismatched locations because the extension is cut off when a homozygous mismatch is detected.
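The seed extension and length test quoted from Ball et al. (steps 3 and 5) can be sketched as follows. This is an illustrative reading of the quoted text only, not Ball et al.'s code. It assumes a precomputed list of per-SNP homozygous mismatch flags, a list of genetic positions in cM, and a seed given as an inclusive SNP index range; the function names are hypothetical.

```python
def extend_seed(seed_start, seed_end, is_homozygous_mismatch):
    """Extend an inclusive SNP index range until a mismatch or the chromosome end."""
    n = len(is_homozygous_mismatch)
    start, end = seed_start, seed_end
    # Walk left while the neighbouring SNP is not a homozygous mismatch.
    while start > 0 and not is_homozygous_mismatch[start - 1]:
        start -= 1
    # Walk right under the same condition.
    while end < n - 1 and not is_homozygous_mismatch[end + 1]:
        end += 1
    return start, end


def is_stored_match(start, end, position_cm, min_length_cm=6.0):
    """Apply the length test quoted from step 5 (longer than 6 cM)."""
    return position_cm[end] - position_cm[start] > min_length_cm


mismatch = [False, True, False, False, False, False, True, False]
position_cm = [0.0, 1.5, 3.0, 5.0, 7.5, 10.0, 12.0, 14.0]

start, end = extend_seed(3, 4, mismatch)
print(start, end)                                # 2 5
print(is_stored_match(start, end, position_cm))  # True
```

In this toy input the extended segment is bounded on both sides by a homozygous mismatch (indices 1 and 6), which corresponds to the construction of "within" adopted for examination purposes.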
Ball et al. does not disclose encoding the target dataset to generate a pair of encoded target bitmap sequences based on an encoding scheme, wherein the target dataset comprises a pair of data value sequences, the encoding scheme defines encoding values based on homogeneity between the pair of data value sequences, and the pair of encoded target bitmap sequences comprises a first encoded target bitmap sequence that encodes a first type of homogeneous locations and a second encoded target bitmap sequence that encodes a second type of homogeneous locations.
However, Layer et al. discloses “For VCF, which encodes diploid genotypes as 0/0 for homozygotes of the reference allele, 0/1 for heterozygotes, 1/1 for homozygotes of the alternate allele and ./. for unknown genotypes (Supplementary Fig. 2a), comparing the genotypes of two or more individuals requires iterative tests of each genotype for each individual.” (Online Methods, Section “Representing sample genotypes with bitmap indices”, paragraph 1, lines 8-13). This demonstrates an encoding scheme where diploid genotypes are the target datasets comprising a pair of data values encoded to generate a pair of bitmap sequences. The scheme defines encoding values based on homogeneity between the pair of data value sequences, where 0/0 for homozygotes of the reference allele represents the first type of homogeneous locations and 1/1 for homozygotes of the alternate allele represents the second type of homogeneous locations.
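To make the mapping asserted above concrete, the following sketch converts the VCF-style genotype codes quoted from Layer et al. into two bitmaps, one marking homozygous-reference locations and one marking homozygous-alternate locations. It is an assumption-laden illustration of the examiner's reading: it assumes unphased "/"-separated genotype strings, stores each bitmap as a Python integer, and is not a description of the index layout actually used by Layer et al. The function name is hypothetical.

```python
def genotypes_to_bitmaps(genotypes):
    """Map VCF-style genotype strings to (hom_ref_bits, hom_alt_bits) integers.

    Bit i corresponds to genotypes[i]. Heterozygous ("0/1", "1/0") and
    unknown ("./.") genotypes set neither bit.
    """
    hom_ref = 0
    hom_alt = 0
    for i, gt in enumerate(genotypes):
        if gt == "0/0":
            hom_ref |= 1 << i
        elif gt == "1/1":
            hom_alt |= 1 << i
    return hom_ref, hom_alt


hom_ref, hom_alt = genotypes_to_bitmaps(["0/0", "0/1", "1/1", "./.", "0/0"])
print(bin(hom_ref), bin(hom_alt))  # 0b10001 0b100
```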
With respect to claims 2 and 15:
With respect to the recited wherein comparing the pair of encoded target bitmap sequences with the other pairs of encoded bitmap sequences to identify homogeneous mismatched locations, Ball et al. discloses “For each pair of individuals, identify windows in which the alleles at all SNPs in one of the individual’s two phased haplotypes are identical to all the alleles at the same positions in one of the other individual’s phased haplotypes. We call these “seed matches” (see Figure 3.1, section D).” (Pages 16-17, 3.1. Matching Algorithm, step 2, lines 1-3). Also, further discloses “For each seed match, we attempt to extend the seed match in both directions along the chromosome until (a) the beginning or end of the chromosome is reached, or (b) a homozygous mismatch is detected. A homozygous mismatch is a pair of genotypes at the same SNP that are incompatible regardless of how they are phased (for example, AA and GG). The estimated IBD region is defined by the start and end positions of the SNPs included in the extended segment (see Figure 3.1, section D).” (Page 17, 3.1. Matching Algorithm, step 3, lines 1-7). The windows of two phased haplotypes in one individual compared to a window of two phased haplotypes in another individual suggests a comparison of a pair of encoded target bitmap sequences with other pairs of encoded bitmap sequences as this is done in each pair of individuals. The seed match resulting from this comparison is used to identify homogeneous mismatched locations, where the pairs of genotypes at this location are both homogeneous and have the same alleles, but have different types of homogeneity as described in the example.
With respect to the recited sampling the pair of encoded target bitmap sequences to generate a pair of sparse target bitmap sequences, Ball et al. discloses “Subdivide each chromosome into short segments, which we call ‘windows’. In our implementation, all windows contain exactly 96 SNPs.” (Page 16, 3.1. Matching Algorithm, step 1, lines 1-2). Subdividing into short segments suggests sampling bitmap sequences to generate sparse bitmap sequences, which are represented as “windows”.
With respect to the recited comparing the pair of sparse target bitmap sequences to other pairs of sparse bitmap sequences, Ball et al. discloses “For each pair of individuals, identify windows in which the alleles at all SNPs in one of the individual’s two phased haplotypes are identical to all the alleles at the same positions in one of the other individual’s phased haplotypes.” (Pages 16-17, 3.1. Matching Algorithm, step 2, lines 1-3). This suggests that the windows consisting of pairs of sparse segments are compared between pairs of individuals to identify identical alleles.
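The windowing and window comparison quoted from Ball et al. (steps 1 and 2) can be sketched as below. The sketch illustrates only the quoted windowing, not the sparse sampling recited in the claims. It assumes each individual is given as two phased haplotype strings of equal length; the function names are hypothetical, and the toy data uses a window size of 4 rather than the 96 SNPs quoted above.

```python
WINDOW_SIZE = 96  # window length quoted from Ball et al.


def split_into_windows(haplotype, window_size=WINDOW_SIZE):
    """Cut a haplotype string into consecutive windows; the last may be shorter."""
    return [haplotype[i:i + window_size] for i in range(0, len(haplotype), window_size)]


def seed_windows(haps_a, haps_b, window_size=WINDOW_SIZE):
    """Indices of windows where any haplotype of A equals any haplotype of B."""
    windows_a = [split_into_windows(h, window_size) for h in haps_a]
    windows_b = [split_into_windows(h, window_size) for h in haps_b]
    n_windows = len(windows_a[0])
    seeds = []
    for w in range(n_windows):
        if any(wa[w] == wb[w] for wa in windows_a for wb in windows_b):
            seeds.append(w)
    return seeds


# Toy data with a window size of 4 instead of 96 to keep the example readable.
person_a = ("AAGGCCTT", "AAGTCCTA")
person_b = ("TTGGCCTT", "AAGTGGGG")
print(seed_windows(person_a, person_b, window_size=4))  # [0, 1]
```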
With respect to claims 3 and 16:
With respect to the recited wherein identifying the matched segment between the target dataset and one of the other datasets based on homogeneous mismatched locations identified, Ball et al. discloses “For each seed match, we attempt to extend the seed match in both directions along the chromosome until (a) the beginning or end of the chromosome is reached, or (b) a homozygous mismatch is detected.” (Page 17, 3.1. Matching Algorithm, step 3, lines 1-3). Also, further discloses “If the segment is longer than 6 cM, we store that segment as a match in the database.” (Page 17, 3.1. Matching Algorithm, step 5, lines 1-2). The extended seed match indicates the matched segment identified between the target dataset and one of the other datasets. This is based on homogeneous mismatched locations because the extension is cut off when a homozygous mismatch is detected.
With respect to the recited using the comparison between the pair of sparse target bitmap sequences to other pairs of sparse bitmap sequences as a pre-scan to eliminate mismatches, Ball et al. discloses “For each seed match, we attempt to extend the seed match in both directions along the chromosome until (a) the beginning or end of the chromosome is reached, or (b) a homozygous mismatch is detected.” (Page 17, 3.1. Matching Algorithm, step 3, lines 1-3). The seed match resulting from the comparison between pairs of sparse segments is used to eliminate mismatches by extending both directions until a mismatch is detected, which is then excluded from the overall extended seed match segment.
With respect to the recited comparing, responsive to one of the other datasets passing the pre-scan, the target dataset and said one of the other datasets to identify the matched segment, Ball et al. discloses “For each seed match, we attempt to extend the seed match in both directions along the chromosome until (a) the beginning or end of the chromosome is reached, or (b) a homozygous mismatch is detected.” (Page 17, 3.1. Matching Algorithm, step 3, lines 1-3). Also, further discloses “If the segment is longer than 6 cM, we store that segment as a match in the database.” (Page 17, 3.1. Matching Algorithm, step 5, lines 1-2). This seed-and-extend method occurs at the same time as the pre-scan: the pre-scan filters out mismatches as they are reached, and the extended seed match segment is identified as a matched segment if it is longer than 6 cM.
With respect to claims 4 and 17:
With respect to the recited wherein comparing the pair of encoded target bitmap sequences with one of the other pairs of encoded bitmap sequences to identify the homogeneous mismatched locations, Ball et al. discloses “For each pair of individuals, identify windows in which the alleles at all SNPs in one of the individual’s two phased haplotypes are identical to all the alleles at the same positions in one of the other individual’s phased haplotypes. We call these “seed matches” (see Figure 3.1, section D).” (Pages 16-17, 3.1. Matching Algorithm, step 2, lines 1-3). Also, further discloses “For each seed match, we attempt to extend the seed match in both directions along the chromosome until (a) the beginning or end of the chromosome is reached, or (b) a homozygous mismatch is detected. A homozygous mismatch is a pair of genotypes at the same SNP that are incompatible regardless of how they are phased (for example, AA and GG). The estimated IBD region is defined by the start and end positions of the SNPs included in the extended segment (see Figure 3.1, section D).” (Page 17, 3.1. Matching Algorithm, step 3, lines 1-7). The windows of two phased haplotypes in one individual compared to a window of two phased haplotypes in another individual suggests a comparison between pairs of encoded bitmap sequences. Ball et al. indicates that comparisons are made for each pair of individuals, which encompasses comparisons between one pair of encoded target bitmap sequences and one of the other pairs of encoded bitmap sequences. The seed match resulting from this comparison is used to identify homogeneous mismatched locations, where the pairs of genotypes at this location are both homogeneous and have the same alleles, but have different types of homogeneity as described in the example.
With respect to the recited identifying a seed range of match between the target dataset and another dataset corresponding to said one of the other pairs of encoded bitmap sequences, Ball et al. discloses “For each pair of individuals, identify windows in which the alleles at all SNPs in one of the individual’s two phased haplotypes are identical to all the alleles at the same positions in one of the other individual’s phased haplotypes. We call these “seed matches” (see Figure 3.1, section D).” (Pages 16-17, 3.1. Matching Algorithm, step 2, lines 1-4). This suggests that the windows between pairs of individuals are identified as a seed range of match, where the range depends on the number of alleles that are identical at the same positions.
With respect to the recited comparing the pair of sparse target bitmap sequences with one of the other pairs of sparse bitmap sequences upstream and downstream of the seed range to identify the homogeneous mismatched locations, Ball et al. discloses “For each seed match, we attempt to extend the seed match in both directions along the chromosome until (a) the beginning or end of the chromosome is reached, or (b) a homozygous mismatch is detected.” (Page 17, 3.1. Matching Algorithm, step 3, lines 1-3). The seed matches are the pairs of sparse segments extended in both directions, suggesting upstream and downstream of the seed range. Homogeneous mismatched locations are identified when a homozygous mismatch is detected, stopping the extension of the seed matches.
With respect to claim 5:
With respect to the recited wherein comparing the pair of encoded target bitmap sequences with one of the other pairs of encoded bitmap sequences upstream and downstream of the seed range stops at a threshold range, Ball et al. discloses “For each seed match, we attempt to extend the seed match in both directions along the chromosome until (a) the beginning or end of the chromosome is reached, or (b) a homozygous mismatch is detected.” (Page 17, 3.1. Matching Algorithm, step 3, lines 1-3). This seed-and-extend method indicates a stopping point when the beginning or end of the chromosome is reached or a homozygous mismatch is detected, which suggests a threshold range.
With respect to claim 6:
With respect to the recited wherein identifying a matched segment between the target dataset and one of the other datasets based on the homogeneous mismatched locations identified, Ball et al. discloses “For each seed match, we attempt to extend the seed match in both directions along the chromosome until (a) the beginning or end of the chromosome is reached, or (b) a homozygous mismatch is detected.” (Page 17, 3.1. Matching Algorithm, step 3, lines 1-3). Also, further discloses “If the segment is longer than 6 cM, we store that segment as a match in the database.” (Page 17, 3.1. Matching Algorithm, step 5, lines 1-2). The extended seed match indicates the matched segment identified between the target dataset and one of the other datasets. This is based on homogeneous mismatched locations because the extension is cut off when a homozygous mismatch is detected.
With respect to the recited comparing the pair of encoded target bitmap sequences and a pair of encoded bitmap sequences corresponding to said one of the other datasets location-by-location to identify the homogeneous mismatched locations, Ball et al. discloses “For each seed match, we attempt to extend the seed match in both directions along the chromosome until (a) the beginning or end of the chromosome is reached, or (b) a homozygous mismatch is detected. A homozygous mismatch is a pair of genotypes at the same SNP that are incompatible regardless of how they are phased (for example, AA and GG). The estimated IBD region is defined by the start and end positions of the SNPs included in the extended segment (see Figure 3.1, section D).” (Page 17, 3.1. Matching Algorithm, step 3, lines 1-7). The seed match resulting from the comparison between pairs of segments is used to identify homogeneous mismatched locations by utilizing the described seed-and-extend method. The estimated IBD region defined by start and end positions suggests that the seed match was compared or extended location-by-location to find the homogeneous mismatch locations.
With respect to the recited identifying a candidate segment that is between two homogeneous mismatched locations, Ball et al. discloses “For each seed match, we attempt to extend the seed match in both directions along the chromosome until (a) the beginning or end of the chromosome is reached, or (b) a homozygous mismatch is detected.” (Page 17, 3.1. Matching Algorithm, step 3, lines 1-3). This suggests that the extended seed match is the candidate segment identified between two homogeneous mismatched locations because the seed match is extended on both ends until it reaches a homozygous mismatch.
With respect to the recited determining a length of the candidate segment, Ball et al. discloses “Calculate the length of the candidate matching segment in terms of genetic distance, measured in centimorgans (cM).” (Page 17, 3.1. Matching Algorithm, step 4, lines 1-2). Calculating the length of the candidate matching segment in centimorgans indicates a determination of the length of the candidate segment.
With respect to the recited determining, responsive to the length being larger than a threshold, that the candidate segment is a matched segment, Ball et al. discloses “If the segment is longer than 6 cM, we store that segment as a match in the database.” (Page 17, 3.1. Matching Algorithm, step 5, lines 1-2). This suggests that 6 centimorgans is the threshold length that should be exceeded for the candidate segment to be a matched segment.
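For illustration only, the seed-and-extend, length, and threshold steps quoted above from Ball et al. can be sketched as follows. This is a minimal sketch under stated assumptions, not code from Ball et al. or from the instant application: the function names (`is_homozygous_mismatch`, `extend_seed`, `matched_segment`), the two-character genotype strings, and the `cm_pos` list of per-SNP genetic positions are hypothetical. Only the 6 cM cutoff and the two stopping conditions come from the quoted passages.

```python
from typing import List, Optional, Tuple


def is_homozygous_mismatch(g1: str, g2: str) -> bool:
    """Return True when both genotypes are homozygous for different alleles (e.g. AA vs GG)."""
    return g1[0] == g1[1] and g2[0] == g2[1] and g1[0] != g2[0]


def extend_seed(geno_a: List[str], geno_b: List[str],
                seed_start: int, seed_end: int) -> Tuple[int, int]:
    """Extend the inclusive seed range in both directions until the end of the
    chromosome is reached or the next SNP is a homozygous mismatch."""
    left = seed_start
    while left > 0 and not is_homozygous_mismatch(geno_a[left - 1], geno_b[left - 1]):
        left -= 1
    right = seed_end
    last = len(geno_a) - 1
    while right < last and not is_homozygous_mismatch(geno_a[right + 1], geno_b[right + 1]):
        right += 1
    return left, right


def matched_segment(geno_a: List[str], geno_b: List[str], cm_pos: List[float],
                    seed_start: int, seed_end: int,
                    min_cm: float = 6.0) -> Optional[Tuple[int, int, float]]:
    """Return (start, end, length_cm) if the extended segment is longer than min_cm, else None."""
    left, right = extend_seed(geno_a, geno_b, seed_start, seed_end)
    length_cm = cm_pos[right] - cm_pos[left]  # genetic length of the candidate segment
    if length_cm > min_cm:
        return left, right, length_cm
    return None
```

In this sketch the candidate segment excludes the mismatching SNPs themselves, which is consistent with the construction of "within" adopted above as lying between two homogeneous mismatched locations.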
With respect to claims 8 and 18:
With respect to the recited wherein comparing the pair of encoded target bitmap sequences with another pair of encoded bitmap sequences to identify homogeneous mismatched locations, Ball et al. discloses “For each pair of individuals, identify windows in which the alleles at all SNPs in one of the individual’s two phased haplotypes are identical to all the alleles at the same positions in one of the other individual’s phased haplotypes. We call these “seed matches” (see Figure 3.1, section D).” (Pages 16-17, 3.1. Matching Algorithm, step 2, lines 1-3). Also, further discloses “For each seed match, we attempt to extend the seed match in both directions along the chromosome until (a) the beginning or end of the chromosome is reached, or (b) a homozygous mismatch is detected. A homozygous mismatch is a pair of genotypes at the same SNP that are incompatible regardless of how they are phased (for example, AA and GG). The estimated IBD region is defined by the start and end positions of the SNPs included in the extended segment (see Figure 3.1, section D).” (Page 17, 3.1. Matching Algorithm, step 3, lines 1-7). The windows of two phased haplotypes in one individual compared to a window of two phased haplotypes in another individual suggests a comparison between pairs of encoded bitmap sequences. Ball et al. indicates that comparisons are made for each pair of individuals, which encompasses comparisons between one pair of encoded target bitmap sequences and another pair of encoded bitmap sequences. The seed match resulting from this comparison is used to identify homogeneous mismatched locations, where the pairs of genotypes at this location are both homogeneous and have the same alleles, but have different types of homogeneity as described in the example.
With respect to the recited identifying a common location that indicates the target dataset and the other dataset in comparison are both homogeneous, Ball et al. discloses “For each seed match, we attempt to extend the seed match in both directions along the chromosome until (a) the beginning or end of the chromosome is reached, or (b) a homozygous mismatch is detected. A homozygous mismatch is a pair of genotypes at the same SNP that are incompatible regardless of how they are phased (for example, AA and GG). The estimated IBD region is defined by the start and end positions of the SNPs included in the extended segment (see Figure 3.1, section D).” (Page 17, 3.1. Matching Algorithm, step 3, lines 1-7). This suggests that the estimated IBD region from extending the seed match is the common location where the target dataset and the other dataset in comparison are both homogeneous. The location is identified between homozygous mismatch positions.
Ball et al. does not disclose comparing the first encoded target bitmap sequence that encodes the first type of homogeneous locations of the target dataset to a second encoded bitmap sequence of said another pair, the second encoded bitmap sequence encoding the second type of homogeneous locations of another dataset.
However, Layer et al. discloses “For VCF, which encodes diploid genotypes as 0/0 for homozygotes of the reference allele, 0/1 for heterozygotes, 1/1 for homozygotes of the alternate allele and ./. for unknown genotypes (Supplementary Fig. 2a), comparing the genotypes of two or more individuals requires iterative tests of each genotype for each individual.” (Online Methods, Section “Representing sample genotypes with bitmap indices”, paragraph 1, lines 8-13). This encoding scheme defines encoding values based on homogeneity between the pair of data value sequences, where 0/0 for homozygotes of the reference allele represents the first type of homogeneous locations and 1/1 for homozygotes of the alternate allele represents the second type of homogeneous locations. These diploid genotypes are being encoded into bitmap sequences for each individual so that the first encoded target bitmap sequence of a target individual is compared to a second encoded bitmap sequence of another individual.
With respect to claims 9 and 19:
Ball et al. does not disclose wherein comparing the first encoded target bitmap sequence that encodes the first type of homogeneous locations of the target dataset to the second encoded bitmap sequence of said another pair.
However, Layer et al. discloses “For VCF, which encodes diploid genotypes as 0/0 for homozygotes of the reference allele, 0/1 for heterozygotes, 1/1 for homozygotes of the alternate allele and ./. for unknown genotypes (Supplementary Fig. 2a), comparing the genotypes of two or more individuals requires iterative tests of each genotype for each individual.” (Online Methods, Section “Representing sample genotypes with bitmap indices”, paragraph 1, lines 8-13). This encoding scheme defines encoding values based on homogeneity between the pair of data value sequences, where 0/0 for homozygotes of the reference allele represents the first type of homogeneous locations and 1/1 for homozygotes of the alternate allele represents the second type of homogeneous locations. These diploid genotypes are being encoded into bitmap sequences for each individual so that the first encoded target bitmap sequence of a target individual is compared to a second encoded bitmap sequence of another individual.
With respect to the recited running both the first encoded target bitmap sequence of the target dataset and the second encoded bitmap sequence of said another pair through a bitwise AND operation, Ball et al. discloses “Hashing avoids explicitly making billions of sequence comparisons. More precisely, we implement a hash function, f(h,w), that maps a character string h and window identifier w to an integer value. It has the property that if two different individuals have identical strings in the same window, they will have the same value of f(h,w).” (Page 21, paragraph 2, lines 5-9).
However, Layer et al. discloses “For the 24 genotypes given here (3 individuals, 8 genotypes each), the ASCII-base algorithm executes the “if” statement 24 times, while the bit-wise algorithm executes the logical AND (“&”) only three times, with both algorithms producing equivalent results.” (Supplementary Figure 3b, lines 6-8).
It would have been prima facie obvious to one of ordinary skill in the art to substitute the bitwise AND operation disclosed by Layer et al. for the hash function disclosed by Ball et al. One would be motivated to make this substitution because bitmaps allow for fast and efficient comparisons of many genotypes in a single operation by means of bitwise logical operations (Online Methods, Section “Overview of the GQT genotype-indexing strategy”, paragraph 1, lines 8-11). There is a likelihood of success, since both the hash function and bitwise AND operation are used to make comparisons between sequences and are well known methods in the art of computational genomics before the effective filing date of the claimed invention.
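For illustration only, the kind of bitwise AND comparison discussed above can be sketched as follows. This is a minimal sketch under stated assumptions, not code from Layer et al., Ball et al., or the instant application: it assumes each individual is represented by two hypothetical bit arrays stored as Python integers, one marking homozygous-reference locations and one marking homozygous-alternate locations, with bit i corresponding to location i. The function name `homozygous_mismatch_positions` is likewise hypothetical.

```python
from typing import List


def homozygous_mismatch_positions(a_hom_ref: int, a_hom_alt: int,
                                  b_hom_ref: int, b_hom_alt: int) -> List[int]:
    """Return the locations where individuals A and B are homozygous for opposite alleles.

    Each argument is a bit array held in an int; bit i is 1 when the individual
    has that homozygous genotype at location i.
    """
    # One AND per direction compares every location at once:
    # A hom-ref where B hom-alt, or A hom-alt where B hom-ref.
    mask = (a_hom_ref & b_hom_alt) | (a_hom_alt & b_hom_ref)
    positions = []
    i = 0
    while mask:
        if mask & 1:
            positions.append(i)
        mask >>= 1
        i += 1
    return positions
```

The comparison itself costs two AND operations regardless of how many locations the bit arrays cover, which is the efficiency the cited passage of Layer et al. attributes to bitwise logical operations.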
With respect to claim 10:
Ball et al. does not disclose wherein the encoding scheme defines that the first encoded target bitmap sequence has a first value if the pair of data value sequences are homogeneous of the first type and has a second value otherwise, and the encoding scheme defines that the second encoded target bitmap sequence has the first value if the pair of data value sequences are homogeneous of the second type and has the second value otherwise.
However, Layer et al. discloses “A bitmap index (bitmap) is an efficient strategy for indexing attributes with discrete values that uses a separate bit array for each possible attribute value. In the case of an individual’s genotypes, a bitmap comprises four distinct bit arrays corresponding to each of the four (including ‘unknown’) possible diploid genotypes. The bits in each bit array are set to true (1) if the individual’s genotype at a given variant matches the genotype the array encodes (Supplementary Fig. 2a). Otherwise, the element is set to false (0).” (Online Methods, Section “Representing sample genotypes with bitmap indices”, paragraph 2, lines 2-10). This bitmap index describes defining the first encoded target bitmap sequence a first value true (1) if the pair of data value sequences are homogeneous of the first type and has a second value false (0) otherwise.
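For illustration only, the encoding described in the quoted passage (a bit set to 1 where the genotype matches the type the array encodes, and 0 otherwise) can be sketched as follows for the two homozygous genotype types. This is a minimal sketch under stated assumptions, not code from Layer et al. or the instant application: the function name `encode_homozygous_bitmaps` is hypothetical, genotypes are assumed to be VCF-style strings, and each bit array is stored as a Python integer with bit i corresponding to variant i.

```python
from typing import List, Tuple


def encode_homozygous_bitmaps(genotypes: List[str]) -> Tuple[int, int]:
    """Encode VCF-style genotypes into two bit arrays (as ints).

    The first array has bit i set when genotype i is "0/0" (homozygous reference);
    the second has bit i set when genotype i is "1/1" (homozygous alternate).
    Heterozygous ("0/1") and unknown ("./.") genotypes leave both bits at 0.
    """
    hom_ref = 0
    hom_alt = 0
    for i, gt in enumerate(genotypes):
        if gt == "0/0":
            hom_ref |= 1 << i
        elif gt == "1/1":
            hom_alt |= 1 << i
    return hom_ref, hom_alt
```

Layer et al. describes four bit arrays per individual, including heterozygous and unknown genotypes; the sketch keeps only the two homozygous arrays because those are the ones the claim language addresses.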
With respect to claim 11:
With respect to the recited wherein the matched segment is an identity-by-descent (IBD) segment between two individuals, Ball et al. discloses “The first goal of DNA matching is to accurately identify the DNA segments on the 22 chromosome pairs that are identical-by-descent between pairs of individuals.” (Page 5, paragraph 2, lines 1-2). Also, further discloses “The estimated IBD region is defined by the start and end positions of the SNPs included in the extended segment (see Figure 3.1, section D).” (Page 17, 3.1. Matching Algorithm, step 3, lines 5-7). This suggests that the matched extended segment identified after the seed-and-extend method is an IBD region, or identity-by-descent segment, between two individuals.
With respect to claim 12:
With respect to the recited wherein the target dataset corresponds to a target DNA dataset of a target individual and the other datasets correspond to other DNA datasets of other individuals, Ball et al. discloses “The first goal of DNA matching is to accurately identify the DNA segments on the 22 chromosome pairs that are identical-by-descent between pairs of individuals. Importantly, we would like to identify these IBD segments for every pair of customers in our database.” (Pages 5-6, paragraph 2, lines 1-4). Also, further discloses “the second step is to identify identical DNA sequences between all pairs of individuals in the customer database.” (Page 8, 1.3. Finding matching segments, paragraph 1, lines 2-4). DNA segments that are identical-by-descent are identified in pairs of individuals, therefore one individual in the pair corresponds to a target individual with a target DNA dataset and another individual corresponds to other customers from the database with other DNA datasets.
With respect to claim 13:
Ball et al. does not disclose wherein the first type of homogeneous locations corresponds to major alleles, the second type of homogeneous locations corresponds to minor alleles, the pair of data value sequences of the target dataset corresponds to a pair of DNA sequences, and the homogeneity between the pair of data value sequences corresponds to homozygosity between the pair of DNA sequences.
However, Layer et al. does disclose “the GQT indexing strategy is fundamentally optimized for questions that involve comparisons of sample genotypes among many variant loci.” (Page 64, col. 2, paragraph 2, lines 1-3). Also, further discloses “For VCF, which encodes diploid genotypes as 0/0 for homozygotes of the reference allele, 0/1 for heterozygotes, 1/1 for homozygotes of the alternate allele and ./. for unknown genotypes (Supplementary Fig. 2a), comparing the genotypes of two or more individuals requires iterative tests of each genotype for each individual.” (Online Methods, Section “Representing sample genotypes with bitmap indices”, paragraph 1, lines 8-13). The Genotype Query Tools (GQT) indexing strategy is used to make comparisons between genotypes in various locations. Therefore, this suggests that encoding the first type of homogeneous locations as 0/0 corresponds to major alleles, and encoding the second type of homogeneous locations as 1/1 corresponds to minor alleles. The diploid genotypes are the target datasets comprising a pair of data values encoded to generate a pair of bitmap sequences. The encoding scheme defines encoding values based on homogeneity between the pair of data value sequences, which subsequently corresponds to homozygosity between the pair of DNA sequences.
Claim 14 recites a graphical user interface configured to present result related to the identified matched segment to a user. This is considered an aesthetic design change as the display is only used for showcasing information output and nothing more. See MPEP 2144.04 (I).
Claim 14 recites a system comprising a computing device comprising one or more processors and memory configured to store instructions. Claim 20 recites a non-transitory computer-readable medium configured to store instructions.
Broadly claiming an automated means to replace a manual function to accomplish the same result does not distinguish over the prior art. See Leapfrog Enters., Inc. v. Fisher-Price, Inc., 485 F.3d 1157, 1161, 82 USPQ2d 1687, 1691 (Fed. Cir. 2007) (“Accommodating a prior art mechanical device that accomplishes [a desired] goal to modern electronics would have been reasonably obvious to one of ordinary skill in designing children’s learning devices. Applying modern electronics to older mechanical devices has been commonplace in recent years.”); In re Venner, 262 F.2d 91, 95, 120 USPQ 193, 194 (CCPA 1958); see also MPEP § 2144.04. Furthermore, implementing a known function on a computer has been deemed obvious to one of ordinary skill in the art if the automation of the known function on a general purpose computer is nothing more than the predictable use of prior art elements according to their established functions. KSR Int’l Co. v. Teleflex Inc., 550 U.S. 398, 417, 82 USPQ2d 1385, 1396 (2007); see also MPEP § 2143, Exemplary Rationales D and F. Likewise, it has been found to be obvious to adapt an existing process to incorporate Internet and Web browser technologies for communicating and displaying information because these technologies had become commonplace for those functions. Muniauction, Inc. v. Thomson Corp., 532 F.3d 1318, 1326-27, 87 USPQ2d 1350, 1357 (Fed. Cir. 2008).

Claim 7 is rejected under 35 U.S.C. 103 as being unpatentable over Ball et al. (Discovering genetic matches across a massive, expanding genetic database, 31 March 2016, AncestryDNA Matching White Paper, ancestryDNA, 1-46, https://www.ancestry.com/cs/dna-help/matches/whitepaper) and Layer et al. (Nature Methods, 2015, 13(1), 63-65) as applied to claims 1-6 and 8-20 above, and further in view of Seidman et al. (The American Journal of Human Genetics, 2020, 106(4), 453-466).
Ball et al. and Layer et al. are applied to claims 1-6 and 8-20 above.
With respect to claim 7:
Ball et al. and Layer et al. do not disclose wherein the pair of encoded target bitmap sequences are generated from unphased data of the target dataset.
However, Seidman et al. discloses “Here we outline a method, Identical by Descent via Identical by State (IBIS), that efficiently and accurately detects IBD segments and infers degrees of relatedness in both admixed and unadmixed samples. IBIS works by identifying long stretches of allele sharing between samples using unphased genotype data and leverages bit sets to store which markers each sample carries in a homozygous state.” (Page 453, col. 2, paragraph 2, lines 1-7). This suggests that unphased genotype data and bit sets are used to generate encoded target bitmap sequences.
It would have been prima facie obvious to one of ordinary skill in the art to modify the teachings of Ball et al. and Layer et al. with the teachings of Seidman et al. One would be motivated to make this modification because IBIS can reliably identify segments ≥ 7 centiMorgans (cM) in length and can use these segments to infer sixth degree or closer relatives at comparable accuracy rates to Refined IBD and GERMLINE, two of the most accurate IBD segment-based approaches for relatedness classification (Pages 453-454, paragraph 4, lines 12-17). There is a likelihood of success, since all teachings are methods of identifying IBD segments between individuals and are well known in the art of genomics.

Conclusion
	No claims are allowed.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Jammy Luo whose telephone number is (571)272-2358. The examiner can normally be reached Monday - Friday, 9:00 AM - 5:00 PM EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Larry D Riggs, can be reached at (571)270-3062. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/J.N.L./Examiner, Art Unit 1686
/LARRY D RIGGS II/Supervisory Patent Examiner, Art Unit 1686