I have the following TSV file, and I am trying to read it and store each piece of information separately.
Here is an example of two lines from the file:
Extract of the file
13->7 3 270296:[T]1132070:[T]2807979:[T]
12->8 31 73108:[G]119227:[T]210429:[T]237902:[T]490699:[A]588160:[T]730687:[A]863532:[T]953590:[T]1207654:[T]1270425:[C]1315919:[C]1374547:[C]1787551:[C]1872033:[G]1963836:[T]2112830:[A]2183936:[A]2464064:[T]2573449:[T]2594098:[T]2667677:[C]2815676:[T]2926565:[T]3019188:[T]3023991:[A]3097403:[A]3142179:[A]3180137:[C]3254219:[G]3265026:[G]
As you can see, each line has a different number of entries in the last group. I have tried the following code, but it only saves the first entry of that group:
Draft of the code:
var x = str.split('\n');
var regex = /([0-9]+)\t([0-9]+)\t(([0-9]+):.([ACGTN]).)+/g;
for (var i=0; i<x.length; i++) {
    line = regex.exec(x[i]);
    console.log(line);
    //Example for the first line
    //line[1] = 7
    //line[2] = 3
    //line[3] = 270296:[T]
    //line[4] = 270296
    //line[5] = T
    //that's it
}
My expected output is that each NUM:[LETTER] pair appears either together in one cell of the array (like line[3]) or already separated (like line[4] and line[5]).
Output draft
Idea 1:
line[3] = 270296:[T]
line[4] = 1132070:[T]
line[5] = 2807979:[T]
Idea 2:
line[3] = 270296
line[4] = T
line[5] = 1132070
line[6] = T
line[7] = 2807979
line[8] = T
Any ideas what I am missing to obtain this output?
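A quantified capture group such as (...)+ keeps only one of its repetitions in the match result, so a single regex will not hand back every NUM:[LETTER] pair. If I were doing this, I would break the work into two pieces, one for the first two fields and one for the data, which also makes it easier to understand later. Something like: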
var line = '8 31 73108:[G]119227:[T]210429:[T]237902:[T]490699:[A]588160:[T]730687:[A]863532:[T]953590:[T]1207654:[T]1270425:[C]1315919:[C]1374547:[C]1787551:[C]1872033:[G]1963836:[T]2112830:[A]2183936:[A]2464064:[T]2573449:[T]2594098:[T]2667677:[C]2815676:[T]2926565:[T]3019188:[T]3023991:[A]3097403:[A]3142179:[A]3180137:[C]3254219:[G]3265026:[G]'
// get the numbers and the rest
let [num1, num2, data] = line.split(/\s+/g)
// parse the rest to an array
data = data.match(/([0-9]+:\[[ACGTN]\])/g)
console.log(num1, num2, data)
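If the position and the letter are wanted as separate values (the question's "Idea 2"), one possible approach is to add capture groups and iterate with `String.prototype.matchAll`. This is only a minimal sketch: the property names `position` and `base` are made up for illustration, and the sample string is the data field of the first line in the question.

```javascript
// Data field of the first sample line from the question
const data = '270296:[T]1132070:[T]2807979:[T]';

// matchAll requires the g flag; for each match,
// m[1] is the run of digits and m[2] is the letter inside the brackets
const pairs = [...data.matchAll(/([0-9]+):\[([ACGTN])\]/g)]
  .map(m => ({ position: Number(m[1]), base: m[2] }));

// Flatten into an alternating position/letter list, as in "Idea 2"
const flat = pairs.flatMap(p => [p.position, p.base]);

console.log(pairs);
console.log(flat);
```

Keeping `pairs` as objects is usually easier to work with than the flat list, but the flat form matches the layout asked for in the question.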
From here, any further processing you need, for example turning your data into an array of objects, should be easy:
// array of objects like [{'73108': '[G]'}, ...]
let objArray = data.map(n => {
    let [key, value] = n.split(':')
    return {[key]: value}
})
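To cover the whole file rather than a single line, the same split-then-match steps could be wrapped in a function and mapped over every line. The sketch below is one way to do that, not the only one: `parseLine` and the `id`, `count` and `entries` property names are hypothetical, and the sample `str` is assumed to be tab-separated like the extract in the question (its second line is shortened here for readability).

```javascript
// Hypothetical helper: parse one line of the form "<id> <count> <NUM:[LETTER]...>"
function parseLine(line) {
  // Split on any whitespace (tabs or spaces) into the two leading fields and the data field
  const [id, count, data = ''] = line.trim().split(/\s+/);
  // With the g flag, match() returns every "NUM:[LETTER]" substring, or null if there is none
  const entries = data.match(/[0-9]+:\[[ACGTN]\]/g) || [];
  return { id, count: Number(count), entries };
}

// Sample input; the second line is truncated, so its count field
// no longer equals the number of entries shown
const str = '13->7\t3\t270296:[T]1132070:[T]2807979:[T]\n' +
            '12->8\t31\t73108:[G]119227:[T]210429:[T]\n';

const rows = str
  .split('\n')
  .filter(l => l.trim() !== '') // skip blank lines such as the one after a trailing newline
  .map(parseLine);

console.log(rows);
```

Each element of `rows` then holds one line's leading fields plus an `entries` array whose length can differ from line to line.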