I have this program to generate N random sequences.
import random

N = 5

def randseq(abc, length):
    return "".join([random.choice(abc) for i in range(random.randint(1, length))])

for i in range(N):
    print(f'Sequence {i+1}:')
    print(randseq("ATCG", 120))
I got the sequences
Sequence 1:
TGGTACACGTGCTTAATGTTAACCTGTCTGGCGCAGGGTAACTATTTCATCCCT
Sequence 2:
CGTATATAATGCTTCCTCTTCAGGCGACCTTGCGATAGTGTCCGGCCATGTGAGTCCCTGTGGAGTGCCTTTAGATGACCTATACGTCTTTAGACTATGTTTATGGGG
Sequence 3:
CACAGCCTTCCTCCAATG . . .
Sequence N:
How can I print the longest and the shortest of the N sequences, along with their lengths?
....
Please check the code below. The explanations are in the comments.
import random

def randseq(abc, length):
    return "".join([random.choice(abc) for i in range(random.randint(1, length))])

# Keep the input value in the main part of the code,
# after the function definitions
N = 5

# Initialize the longest sequence with the shortest possible one (an empty string)
# so that every random sequence is longer than this initial value
longest_seq = ""

# Initialize the shortest sequence with a long one
# (assuming that randseq("ATCG", 1000) is long enough)
# so that the random sequences are expected to be shorter than this initial value
shortest_seq = randseq("ATCG", 1000)

for i in range(N):
    print(f'Sequence {i+1}:')
    seq = randseq("ATCG", 120)
    # If this sequence is the longest so far, store it in longest_seq
    if len(seq) > len(longest_seq):
        longest_seq = seq
    # If this sequence is the shortest so far, store it in shortest_seq
    if len(seq) < len(shortest_seq):
        shortest_seq = seq
    print(seq)
    print("")

print('The longest seq is ', longest_seq)
print('The length of longest seq is ', len(longest_seq))
print('The shortest is ', shortest_seq)
print('The length of shortest seq is ', len(shortest_seq))
Example result (the output is random, so yours will differ when you run it):
Sequence 1:
CGGTGATCGCGATTACTGCCCGGCCTTGTCCACTCACAGCGATAACAGTGCTTATAGATCTCTCAAGTCTACCGTCTCACCCGTTGATTACCAA
Sequence 2:
AAGGTCAAGATTCGAATTCGTATCGCCGTATGGATAGGCGAAACGAGGGGTGGCTAAGGGGTAGACAGCAGAGCCGCTTTTGTACACCGTAAAACGGACGGTTCAGAACCGGAGGTACG
Sequence 3:
ACGGCCTCATGGATAATGCCCGGGGGAACAGGGAAGGAAAGATTTTGTCAAACTGATTCAGTTAC
Sequence 4:
GATACA
Sequence 5:
ATCGAAAGGAATATCTGTACGGGACGTTTGGTCTCGAGCCTAGCGTAAGCCGCCCGCAATTCGCTCTGATGAGCTACCG
The longest seq is AAGGTCAAGATTCGAATTCGTATCGCCGTATGGATAGGCGAAACGAGGGGTGGCTAAGGGGTAGACAGCAGAGCCGCTTTTGTACACCGTAAAACGGACGGTTCAGAACCGGAGGTACG
The length of longest seq is 119
The shortest is GATACA
The length of shortest seq is 6
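Because the output is random, each run prints different sequences. If you want repeatable results while testing, one option is to seed the generator first. This is only a sketch: the seed value 42 and the names first_run and second_run are arbitrary choices for illustration, not part of the original code.

```python
import random

def randseq(abc, length):
    # Same generator as above: a random string of 1..length characters from abc
    return "".join(random.choice(abc) for _ in range(random.randint(1, length)))

random.seed(42)  # arbitrary seed, chosen only for this illustration
first_run = [randseq("ATCG", 120) for _ in range(5)]

random.seed(42)  # reseeding restarts the same pseudo-random stream
second_run = [randseq("ATCG", 120) for _ in range(5)]

# Both runs produce the same five sequences
print(first_run == second_run)  # True
```

Remove the random.seed call again when you want a different result on every run.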
Precaution:
In rare cases, the initial value of shortest_seq may itself be shorter than every generated sequence. The program will not crash, but it will then report that initial value as the shortest sequence, even though it is not one of the N printed sequences. You can increase the length passed to randseq for the initial value to make this less likely.
For example, you can change it from:
shortest_seq = randseq("ATCG", 1000)
to:
shortest_seq = randseq("ATCG", 10000)
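Increasing the initial length only makes the problem less likely. If you want to rule it out entirely, one possible alternative is to collect the sequences in a list first and let the built-in max and min pick from that list with key=len. This is a sketch of that idea, not the approach used in the code above; the list name seqs is introduced here for illustration.

```python
import random

def randseq(abc, length):
    # Same generator as above: a random string of 1..length characters from abc
    return "".join(random.choice(abc) for _ in range(random.randint(1, length)))

N = 5

# Generate all sequences first, so the extremes can only come from this list
seqs = [randseq("ATCG", 120) for _ in range(N)]

for i, seq in enumerate(seqs, start=1):
    print(f'Sequence {i}:')
    print(seq)

# key=len compares the sequences by length;
# if several share the same length, the first one is returned
longest_seq = max(seqs, key=len)
shortest_seq = min(seqs, key=len)

print('The longest seq is', longest_seq)
print('The length of longest seq is', len(longest_seq))
print('The shortest is', shortest_seq)
print('The length of shortest seq is', len(shortest_seq))
```

Since no initial placeholder value is needed, the reported shortest sequence is always one of the N generated ones. The trade-off is that all N sequences are kept in memory, which is not an issue for small N.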