
DNA walk


A DNA walk of a genome shows how the frequency of each nucleotide of a base pair changes locally along the sequence. The analysis measures the local proportion of Gs within the GC content and of Ts within the TA content. Lobry was the first to propose this analysis (1996, 1999). Two complementary representations can be derived from the DNA walk: the cumulative TA-skew and GC-skew analyses.

Aim: After reading this description of the algorithm, a reader not trained in genomics should be able to redraw our graphs using the basic genometric data file that is posted on our web resource for each organism as a zip file (.zip).


1) Drawing a DNA walk by reading a sequence file nucleotide by nucleotide.

A simple algorithm draws a DNA walk by assigning a direction to each nucleotide. We propose the following assignment, which differs slightly from Lobry's (Lobry, 1999): T, C, A, and G correspond to the E(ast), S(outh), W(est), and N(orth) directions, respectively. Reading the sequence nucleotide by nucleotide and moving one step in the assigned direction each time, a path emerges on the graph (Figure 1).

 

Figure 1: DNA walk of the sequence

  GTCTGGTGTCTGGAGTTCCTGGGTCTTGAG ACCACAGGACCCACCAGGGACCCAGGACCC

Starting from the bottom left (bold blue line), the curve ends at the bottom left (pink line).
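
The rule above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' program: the names `STEPS` and `dna_walk` are hypothetical, and skipping any character other than A, C, G, or T (such as the space in the sequence above) is an assumption.

```python
# Direction assigned to each nucleotide in this document's system:
# T -> East, A -> West, G -> North, C -> South.
STEPS = {"T": (1, 0), "A": (-1, 0), "G": (0, 1), "C": (0, -1)}


def dna_walk(sequence):
    """Return the list of (x, y) points visited, starting at the origin."""
    x, y = 0, 0
    path = [(x, y)]
    for nucleotide in sequence.upper():
        if nucleotide not in STEPS:
            continue  # assumption: ignore spaces and other symbols
        dx, dy = STEPS[nucleotide]
        x, y = x + dx, y + dy
        path.append((x, y))
    return path


FIGURE_1_SEQUENCE = (
    "GTCTGGTGTCTGGAGTTCCTGGGTCTTGAG"
    "ACCACAGGACCCACCAGGGACCCAGGACCC"
)
path = dna_walk(FIGURE_1_SEQUENCE)
print(path[30], path[-1])
```

For the Figure 1 sequence, the first 30 nucleotides lead to the point (9, 7) and the last 30 bring the walk back to the origin, because the whole sequence contains as many Ts as As and as many Gs as Cs.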

2) Drawing a DNA walk by slicing a sequence file into small windows

Lobry (1996) suggested a quick way to draw this kind of graph: cut the genome into windows of equal length.

 

Figure 2: DNA walk of the same sequence as in Figure 1: GTCTGGTGTCTGGAGTTCCTGGGTCTTGAG ACCACAGGACCCACCAGGGACCCAGGACCC

The sequence was sliced into 5-nucleotide windows, and only the point reached at the fifth nucleotide of each window is plotted. We can also work with the mean values of the window…

Comment: this method is less precise than the first one, but it can be used with spreadsheet software without affecting the final resolution of the curve at the genome level.
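
The relation between the two methods can be checked on the Figure 1 sequence. The sketch below keeps every fifth point of the nucleotide-by-nucleotide walk, which is how we read the description of Figure 2. The function names are hypothetical, and the sequence is assumed to contain only A, C, G, and T.

```python
STEPS = {"T": (1, 0), "A": (-1, 0), "G": (0, 1), "C": (0, -1)}


def walk_points(sequence):
    """Nucleotide-by-nucleotide walk: one point per nucleotide read."""
    x, y = 0, 0
    points = []
    for nucleotide in sequence:
        dx, dy = STEPS[nucleotide]
        x, y = x + dx, y + dy
        points.append((x, y))
    return points


def every_kth_point(points, k):
    """Keep the point reached at the end of each complete k-nucleotide window.

    A trailing incomplete window, if any, is dropped by this slice.
    """
    return points[k - 1::k]


sequence = "GTCTGGTGTCTGGAGTTCCTGGGTCTTGAGACCACAGGACCCACCAGGGACCCAGGACCC"
coarse = every_kth_point(walk_points(sequence), 5)
print(len(coarse), coarse[0], coarse[-1])
```

The 60 points reduce to 12. The coarse curve passes through positions that also lie on the detailed one, so the overall shape is preserved while the detail inside each window is lost.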

 

2.1) The genome is cut into n windows W of equal size (the last window may be smaller than the others).

 

W1    W2    W3    ...    ...    Wn-1    Wn

2.2) In each window, each nucleotide is counted, giving cA, cC, cG, and cT for A, C, G, and T respectively.

 

W1      cA1     cC1     cG1     cT1
W2      cA2     cC2     cG2     cT2
W3      cA3     cC3     cG3     cT3
...     ...     ...     ...     ...
Wn-1    cAn-1   cCn-1   cGn-1   cTn-1
Wn      cAn     cCn     cGn     cTn

Example: the Mycoplasma genitalium genome (download the compressed text file), cut into windows of 1000 nucleotides
(Mycoplasma genitalium G37 complete genome, L43967.1, 580074 bp, window: 1000 bp).

 

Position of the window center (nt)    cA     cC     cG     cT
500                                   453    93     86     368
1500                                  400    120    133    347
2500                                  374    122    164    340
3500                                  345    145    200    310
...                                   ...    ...    ...    ...
578500                                313    138    141    408
579500                                318    149    145    388
580037                                33     8      4      29
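
Steps 2.1 and 2.2 can be sketched as follows. This is an illustration under stated assumptions rather than the program used to produce the table: `window_counts` is a hypothetical name, a short toy string stands in for a genome file, and the center is computed as the 0-based window start plus half the window length, a convention chosen because it reproduces the positions 500, 1500, ... and 580037 listed above.

```python
from collections import Counter


def window_counts(sequence, window_size):
    """Yield (center, cA, cC, cG, cT) for consecutive windows.

    The last window may be shorter than the others.
    """
    for start in range(0, len(sequence), window_size):
        window = sequence[start:start + window_size].upper()
        counts = Counter(window)
        # Assumed convention: 0-based start + half the window length.
        center = start + len(window) // 2
        yield (center, counts["A"], counts["C"], counts["G"], counts["T"])


# Toy example: 12 nucleotides cut into windows of 5 (the last one holds 2).
rows = list(window_counts("GTCTGGTGTCTG", 5))
for row in rows:
    print(*row)
```

With a window of 1000 and a sequence of 580074 nucleotides, the same loop yields 581 windows, the last one holding 74 nucleotides, which matches the sum of the counts in the last row of the table.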

2.3) Two calculations are performed for each window: xi and yi are determined.

 

W1     cA1    cC1    cG1    cT1    x1 = cT1 - cA1    y1 = cG1 - cC1
W2     cA2    cC2    cG2    cT2    x2 = cT2 - cA2    y2 = cG2 - cC2
...    ...    ...    ...    ...    ...               ...
Wn     cAn    cCn    cGn    cTn    xn = cTn - cAn    yn = cGn - cCn
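
As a worked example, the sketch below applies these two formulas to the first four rows of the Mycoplasma genitalium table from step 2.2. The function name `window_steps` is hypothetical.

```python
# First four rows of the Mycoplasma genitalium table (step 2.2):
# (window center, cA, cC, cG, cT)
rows = [
    (500, 453, 93, 86, 368),
    (1500, 400, 120, 133, 347),
    (2500, 374, 122, 164, 340),
    (3500, 345, 145, 200, 310),
]


def window_steps(rows):
    """Return (x_i, y_i) = (cT - cA, cG - cC) for each window."""
    return [(cT - cA, cG - cC) for _center, cA, cC, cG, cT in rows]


steps = window_steps(rows)
print(steps)  # [(-85, -7), (-53, 13), (-34, 42), (-35, 55)]
```

In these windows A outnumbers T (negative x), while G overtakes C from the second window onward (positive y).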

2.4) A cumulative curve is calculated: Xi and Yi are determined.

 

W1 ...    x1 = cT1 - cA1    y1 = cG1 - cC1    X1 = sum(x1 to x1)    Y1 = sum(y1 to y1)
W2 ...    x2 = cT2 - cA2    y2 = cG2 - cC2    X2 = sum(x1 to x2)    Y2 = sum(y1 to y2)
...       ...               ...               ...                   ...
Wn ...    xn = cTn - cAn    yn = cGn - cCn    Xn = sum(x1 to xn)    Yn = sum(y1 to yn)
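
The running sums can be obtained with a standard cumulative-sum routine. The sketch below uses Python's `itertools.accumulate` on the differences obtained from the first four Mycoplasma genitalium windows; it illustrates the calculation and is not the authors' own code.

```python
from itertools import accumulate

# Per-window differences for the first four Mycoplasma genitalium windows,
# computed from the table in step 2.2 as x = cT - cA and y = cG - cC.
x = [-85, -53, -34, -35]
y = [-7, 13, 42, 55]

X = list(accumulate(x))  # X_i = x_1 + ... + x_i
Y = list(accumulate(y))  # Y_i = y_1 + ... + y_i
print(X)  # [-85, -138, -172, -207]
print(Y)  # [-7, 6, 48, 103]
```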

2.5) The cumulative curve is drawn by plotting the points (Xi, Yi) in the order of the data, from i = 1 to n.
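
Any tool that draws an XY chart with points joined in data order can render this curve. As one possible route (an assumption, not a workflow stated by the authors), the sketch below writes the points to CSV so that they can be opened in spreadsheet software. The function name `write_walk_csv` and the column names are hypothetical.

```python
import csv
import io


def write_walk_csv(centers, X, Y, handle):
    """Write one row per window: center position, X_i, Y_i."""
    writer = csv.writer(handle)
    writer.writerow(["center", "X", "Y"])
    for row in zip(centers, X, Y):
        writer.writerow(row)


# Values for the first four Mycoplasma genitalium windows (step 2.2 table).
centers = [500, 1500, 2500, 3500]
X = [-85, -138, -172, -207]
Y = [-7, 6, 48, 103]

buffer = io.StringIO()  # stands in for a real file opened with newline=""
write_walk_csv(centers, X, Y, buffer)
print(buffer.getvalue())
```

In the spreadsheet, the X column goes on the horizontal axis and the Y column on the vertical axis, with the rows kept in their original order so that consecutive windows are joined.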

2.6) Following the description above, the DNA walk is labelled as follows on our graphs, which are generated by the "nucleotide by nucleotide" method:

TmAc vs GmCc, where TmAc stands for T minus A, cumulated, and GmCc for G minus C, cumulated: x is the cumulative number of Ts minus the number of As, and y is the cumulative number of Gs minus the number of Cs.

Lobry chose a different assignment: T, G, A, and C correspond to the E, S, W, and N directions, respectively. Lobry's outputs are therefore mirror images of ours about the x axis. Compare the DNA walk of Borrelia burgdorferi in Lobry's drawing system and in ours.

 

Figure 3: DNA walk of Borrelia burgdorferi (panels: Lobry's system; our system)
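
The mirror relation between the two drawing systems can be checked on a short sequence. The sketch below walks the first half of the Figure 1 sequence under both assignments. The dictionary and function names are hypothetical, and the sequence is assumed to contain only A, C, G, and T.

```python
# This document's system: T -> E, C -> S, A -> W, G -> N.
OURS = {"T": (1, 0), "A": (-1, 0), "G": (0, 1), "C": (0, -1)}
# Lobry's system as described above: T -> E, G -> S, A -> W, C -> N.
LOBRY = {"T": (1, 0), "A": (-1, 0), "G": (0, -1), "C": (0, 1)}


def walk(sequence, steps):
    """Return the points visited when each nucleotide moves one step."""
    x, y = 0, 0
    points = [(x, y)]
    for nucleotide in sequence:
        dx, dy = steps[nucleotide]
        x, y = x + dx, y + dy
        points.append((x, y))
    return points


sequence = "GTCTGGTGTCTGGAGTTCCTGGGTCTTGAG"
ours = walk(sequence, OURS)
lobry = walk(sequence, LOBRY)
mirrored = [(x, -y) for x, y in ours]
print(ours[-1], lobry[-1], lobry == mirrored)
```

Swapping the directions of G and C only changes the sign of every y coordinate, so one drawing is obtained from the other by flipping it about the x axis.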
