coverage from median k-mer count -> coverage on nucleotide level #883

Open
qingpeng opened this issue Mar 26, 2015 · 1 comment

@qingpeng
Contributor

The median k-mer count of a read is smaller than the actual sequencing depth of the region the read comes from. We have seen this before, for example in the diginorm paper and in the analysis of the IGS method. I am not sure whether we have discussed this, but there is a way to convert the median k-mer count (read coverage) into real sequencing depth (coverage at the nucleotide level).

It is mentioned in the Quake paper:
"Note that the expected coverage of a k-mer in the genome using reads of length L will be (L - k + 1)/L times the expected coverage of a single nucleotide because the full k-mer must be covered by the read."

http://genomebiology.com/2010/11/11/R116/figure/F3
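
To make the size of the effect concrete, here is a worked instance of the Quake relation. The values L = 100 and k = 20 are illustrative assumptions, not numbers taken from this issue, and the relation assumes reads of uniform length:

```latex
C_{\text{k-mer}} = \frac{L - k + 1}{L}\, C_{\text{nt}}
\qquad\Longrightarrow\qquad
C_{\text{nt}} = \frac{L}{L - k + 1}\, C_{\text{k-mer}}

% Illustrative values: L = 100, k = 20
\frac{L - k + 1}{L} = \frac{81}{100} = 0.81,
\qquad
C_{\text{k-mer}} = 20 \;\Rightarrow\; C_{\text{nt}} = \frac{20}{0.81} \approx 24.7
```

Under these assumed values, a k-mer coverage of 20 corresponds to a nucleotide-level coverage of roughly 24.7.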

I guess this has some influence on our method.

For the IGS method, this makes the concept easier to explain. The size of an IGS is now exactly the read length; only the abundance of the IGS needs to be adjusted to "real" coverage, rather than the coverage derived from median k-mer abundance.

For diginorm, when we run normalize-by-median.py with the "-C" parameter to set the "desired coverage", the value actually refers to coverage from median k-mer abundance, not coverage at the nucleotide level.

In practice I don't think this is a big deal, but it may be worth some discussion.
A potential feature to consider is accepting the "real" coverage as an optional argument. The conversion is straightforward, as shown in the formula above.
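
A minimal sketch of what such a conversion could look like, based only on the Quake relation quoted above. The function names are hypothetical and are not part of khmer. The sketch assumes a single uniform read length and ignores sequencing errors, so the result is an approximation:

```python
def kmer_to_nucleotide_coverage(kmer_cov, read_len, k):
    """Convert k-mer-level coverage to nucleotide-level coverage.

    Hypothetical helper (not a khmer function). Uses the Quake relation
    kmer_cov = nuc_cov * (L - k + 1) / L, assuming uniform read length L.
    """
    if not 0 < k <= read_len:
        raise ValueError("k must satisfy 0 < k <= read_len")
    return kmer_cov * read_len / (read_len - k + 1)


def nucleotide_to_kmer_coverage(nuc_cov, read_len, k):
    """Inverse conversion: nucleotide-level coverage to k-mer-level coverage.

    Hypothetical helper; same assumptions as above.
    """
    if not 0 < k <= read_len:
        raise ValueError("k must satisfy 0 < k <= read_len")
    return nuc_cov * (read_len - k + 1) / read_len


if __name__ == "__main__":
    # Illustrative values only: 100 bp reads, k = 20.
    # A median k-mer cutoff of 20 expressed as nucleotide-level coverage:
    print(kmer_to_nucleotide_coverage(20, read_len=100, k=20))
    # A "real" coverage target of 20 expressed as a median k-mer cutoff:
    print(nucleotide_to_kmer_coverage(20, read_len=100, k=20))
```

With an optional "real coverage" argument, a script could apply the second function once to obtain the k-mer-level cutoff and leave the rest of the normalization logic unchanged.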

@ctb
Member

ctb commented Mar 26, 2015 via email

@mr-c mr-c added this to the unscheduled milestone Jul 30, 2015