Here each instance of the class DNA corresponds to a string such as 'GCCCAC'. Arrays of k-mers (substrings of length k) can be constructed from these strings. For this string there are six 1-mers, five 2-mers, four 3-mers, three 4-mers, two 5-mers and one 6-mer:
["G", "C", "C", "C", "A", "C"]
["GC", "CC", "CC", "CA", "AC"]
["GCC", "CCC", "CCA", "CAC"]
["GCCC", "CCCA", "CCAC"]
["GCCCA", "CCCAC"]
["GCCCAC"]
The pattern should be evident. See the Wiki for details.
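As a minimal sketch of that pattern, the arrays above can be generated with Ruby's each_cons, which slides a window of length k across the characters. The variable names sequence and all_kmers are illustrative only and are not part of the problem statement.

```ruby
sequence = 'GCCCAC'

# For each k from 1 up to the string length, collect every contiguous
# substring of length k (a k-mer), ordered by starting position.
all_kmers = (1..sequence.length).map do |k|
  sequence.each_char.each_cons(k).map(&:join)
end

# Print one array per value of k, matching the listing above.
all_kmers.each { |kmers| p kmers }
```

A string of length n has n - k + 1 k-mers, which is why the arrays shrink by one element as k grows.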
The problem is to write the method shared_kmers(k, dna2) of the DNA class. It returns an array of all pairs [i, j] such that the k-mer starting at position i in the receiving DNA object equals the k-mer starting at position j in dna2.
dna1 = DNA.new('GCCCAC')
dna2 = DNA.new('CCACGC')
dna1.shared_kmers(2, dna2)
#=> [[0, 4], [1, 0], [2, 0], [3, 1], [4, 2]]
dna2.shared_kmers(2, dna1)
#=> [[0, 1], [0, 2], [1, 3], [2, 4], [4, 0]]
dna1.shared_kmers(3, dna2)
#=> [[2, 0], [3, 1]]
dna1.shared_kmers(4, dna2)
#=> [[2, 0]]
dna1.shared_kmers(5, dna2)
#=> []
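class DNA
  attr_accessor :sequencing

  def initialize(sequencing)
    @sequencing = sequencing
  end

  # All contiguous substrings of length k, ordered by starting position.
  def kmers(k)
    @sequencing.each_char.each_cons(k).map(&:join)
  end

  # Pairs [i, j] where the k-mer at i in self equals the k-mer at j in dna.
  def shared_kmers(k, dna)
    other_kmers = dna.kmers(k) # computed once rather than on every outer iteration
    kmers(k).each_with_object([]).with_index do |(kmer, result), index|
      other_kmers.each_with_index do |other_kmer, other_kmer_index|
        result << [index, other_kmer_index] if kmer == other_kmer
      end
    end
  end
end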
dna1 = DNA.new('GCCCAC')
dna2 = DNA.new('CCACGC')
dna1.kmers(2)
#=> ["GC", "CC", "CC", "CA", "AC"]
dna2.kmers(2)
#=> ["CC", "CA", "AC", "CG", "GC"]
dna1.shared_kmers(2, dna2)
#=> [[0, 4], [1, 0], [2, 0], [3, 1], [4, 2]]
dna2.shared_kmers(2, dna1)
#=> [[0, 1], [0, 2], [1, 3], [2, 4], [4, 0]]
dna1.shared_kmers(3, dna2)
#=> [[2, 0], [3, 1]]
dna1.shared_kmers(4, dna2)
#=> [[2, 0]]
dna1.shared_kmers(5, dna2)
#=> []
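The solution above compares every k-mer of one sequence with every k-mer of the other. For long sequences, one possible alternative is to index the second sequence's k-mers by position first and then look each k-mer up in that index. The sketch below is not the code above; kmers_of and shared_kmers_indexed are hypothetical helper names that work on plain strings rather than DNA objects.

```ruby
# Hypothetical helper: all k-mers of a plain string, by starting position.
def kmers_of(sequence, k)
  sequence.each_char.each_cons(k).map(&:join)
end

# Hypothetical alternative to DNA#shared_kmers that uses a hash index.
def shared_kmers_indexed(seq1, seq2, k)
  # Map each k-mer of seq2 to the list of positions where it starts.
  positions = Hash.new { |hash, key| hash[key] = [] }
  kmers_of(seq2, k).each_with_index { |kmer, j| positions[kmer] << j }

  # For each k-mer of seq1, emit one [i, j] pair per matching position in seq2.
  # fetch with a default avoids adding new keys to the index during lookup.
  kmers_of(seq1, k).each_with_index.flat_map do |kmer, i|
    positions.fetch(kmer, []).map { |j| [i, j] }
  end
end

p shared_kmers_indexed('GCCCAC', 'CCACGC', 2)
#=> [[0, 4], [1, 0], [2, 0], [3, 1], [4, 2]]
```

The pairs come out in the same order as before, sorted by i and then by j, because both passes walk the k-mers in position order.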