EnumerateNullomers


Identifies every nullomer (a kmer that does not occur in the genome) and counts every kmer that does occur, for a specified kmer length, in a FASTA file.



Required arguments:

| Argument | Explanation |
|---|---|
| --genome_filepath | Path to the FASTA file containing the genome being analyzed.* |
| --nullomer_output_filepath | Path to the output .txt file where the nullomers absent from the supplied genome will be written, one nullomer per line. |
| --kmer_output_filepath | Path to the output .tsv file where the kmers from the supplied genome, and their corresponding occurrence counts, will be written. |
| --kmer_length | Length of the kmers/nullomers to be enumerated. |



Outputs:

  • Text file containing a list of nullomers absent from the supplied genome file.

  • Tab-separated (.tsv) file containing every kmer that occurs at least once in the supplied genome file, accompanied by its occurrence count.
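
To make the two outputs concrete, here is a minimal Python sketch of the general technique, not the tool's actual implementation. It assumes a DNA alphabet of A, C, G, and T, skips any window containing another symbol (such as N), counts a single strand only, and ignores case. The function names, the sorted row order, and the headerless `kmer<TAB>count` layout are all illustrative assumptions; the documentation above does not specify them.

```python
from collections import Counter
from itertools import product

ALPHABET = "ACGT"  # assumption: DNA alphabet; windows with other symbols are skipped


def count_kmers(sequence, k):
    """Count every length-k window that contains only A, C, G, or T."""
    seq = sequence.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if all(base in ALPHABET for base in kmer):
            counts[kmer] += 1
    return counts


def find_nullomers(counts, k):
    """Return, in lexicographic order, every possible kmer with no occurrences."""
    all_kmers = ("".join(p) for p in product(ALPHABET, repeat=k))
    return [kmer for kmer in all_kmers if kmer not in counts]


def write_outputs(counts, nullomers, nullomer_path, kmer_path):
    """Write one nullomer per line, and one 'kmer<TAB>count' row per present kmer."""
    with open(nullomer_path, "w") as handle:
        for kmer in nullomers:
            handle.write(kmer + "\n")
    with open(kmer_path, "w") as handle:
        # Hypothetical layout: no header row, rows sorted by kmer.
        for kmer, count in sorted(counts.items()):
            handle.write(f"{kmer}\t{count}\n")


counts = count_kmers("ACGTAC", 2)
nullomers = find_nullomers(counts, 2)
print(dict(counts))    # {'AC': 2, 'CG': 1, 'GT': 1, 'TA': 1}
print(len(nullomers))  # 12 of the 16 possible 2-mers are absent
```

Because there are 4^k possible kmers, enumerating all of them this way is only practical for modest values of the kmer length.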


Note

* The genome FASTA file must be formatted such that the FASTA headers read “>chr1”, “>chr2”, “>chr3”, …
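
A quick pre-flight check of the header format can be sketched in Python as follows. This is a hedged illustration only: the helper name `find_bad_headers` is hypothetical, and the pattern assumes a valid header is ">chr" followed by a label with no trailing description. The note above does not say whether non-numeric labels are accepted, so the pattern may need tightening or loosening.

```python
import re

# Assumption: a valid header is ">chr" plus a label, with nothing after it.
HEADER_PATTERN = re.compile(r"^>chr\S+$")


def find_bad_headers(lines):
    """Return every FASTA header line that does not match HEADER_PATTERN."""
    bad = []
    for line in lines:
        line = line.rstrip("\r\n")
        if line.startswith(">") and not HEADER_PATTERN.match(line):
            bad.append(line)
    return bad


example = [">chr1\n", "ACGT\n", ">NC_000001.11 Homo sapiens\n", ">chr2\n"]
print(find_bad_headers(example))  # ['>NC_000001.11 Homo sapiens']
```

Since the function accepts any iterable of lines, an open file handle can be passed in directly, which avoids loading a whole genome into memory.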