Replace a list of characters with indices in a string in Python
Instead of string concatenation (which is wasteful because every step creates and discards a new string instance), use a list:
coordinates = [[1,5], [10,15], [25, 35]] # sorted
line = 'ATCACGTGTGTGTACACGTACGTGTGNGTNGTTGAGTGKWSGTGAAAAAKCT'
result = list(line)
# opted for exclusive end pos
for r in [range(start,end) for start,end in coordinates]:
    for p in r:
        result[p]='N'
res = ''.join(result)
print(res)
To get:
ANNNNGTGTGNNNNNACGTACGTGTNNNNNNNNNNGTGKWSGTGAAAAAKCT
Optimized to use slice assignment, still with an exclusive end:
for start,end in coordinates:
    result[start:end] = ["N"]*(end-start)
res = ''.join(result)
print(line)
print(res)
This gives you the output you wanted:
ATCACGTGTGTGTACACGTACGTGTGNGTNGTTGAGTGKWSGTGAAAAAKCT
ANNNNGTGTGNNNNNACGTACGTGTNNNNNNNNNNGTGKWSGTGAAAAAKCT
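If you need this in more than one place, the slice version can be wrapped in a small helper. This is only a sketch: the name mask_regions and the mask parameter are made up for illustration, and it assumes 0-indexed, end-exclusive coordinates as above. The clamping is there because assigning to a slice that runs past the end of the list would otherwise make the result longer than the input.

```python
def mask_regions(seq, regions, mask="N"):
    """Return seq with every [start, end) region replaced by the mask character."""
    chars = list(seq)
    for start, end in regions:
        # Clamp to the sequence bounds so the result keeps its original length.
        start = max(0, start)
        end = min(len(chars), end)
        if start < end:
            chars[start:end] = [mask] * (end - start)
    return "".join(chars)


line = 'ATCACGTGTGTGTACACGTACGTGTGNGTNGTTGAGTGKWSGTGAAAAAKCT'
coordinates = [[1, 5], [10, 15], [25, 35]]
print(mask_regions(line, coordinates))
```

With the coordinates from the question this prints the same masked line as shown above.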
Good question, this should work.
coordinates = [[1,5], [10,15], [25, 35]]
line = 'ATCACGTGTGTGTACACGTACGTGTGNGTNGTTGAGTGKWSGTGAAAAAKCT'
for L,R in coordinates:
    line = line[:L] + "N"*(R-L) + line[R:]
print(line)
You may need to adjust this depending on how the coordinates are defined, e.g. if they are inclusive or 1-indexed.
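For instance, if your coordinates turn out to be 1-indexed with an inclusive end (an assumption, the question does not say), you could convert them first and then reuse the same loop. The helper name to_half_open and the one_based list below are hypothetical, just to show the arithmetic:

```python
def to_half_open(coordinates):
    """Convert 1-indexed, inclusive (start, end) pairs to 0-indexed, end-exclusive pairs."""
    # Only the start shifts by one: a 1-based inclusive end is the same
    # number as a 0-based exclusive end.
    return [(start - 1, end) for start, end in coordinates]


line = 'ATCACGTGTGTGTACACGTACGTGTGNGTNGTTGAGTGKWSGTGAAAAAKCT'
one_based = [(2, 5), (11, 15), (26, 35)]  # hypothetical 1-indexed, inclusive input

masked = line
for L, R in to_half_open(one_based):
    masked = masked[:L] + "N" * (R - L) + masked[R:]
print(masked)
```

These 1-based pairs describe the same regions as [[1,5], [10,15], [25,35]] with an exclusive end, so the masked line comes out identical.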
We need more people working with DNA, so great work.