- Search now uses KMD by default instead of dichotomic search (massive speed gain). Many thanks to @Keija for the implementation. Go to #34 for details and benchmarks.
- CSVFile: fixed bug when slice start was None
- CSVFile: Better support for string separator
- AGN SNPs: quality is now cast to float by the importer
- Travis integration
- Minor CSV parser updates
- CSVFile will now ignore empty lines and comments
- Added synonymousCodonsFrequencies
- It is no longer mandatory to set the whole legend of a CSV file at initialization; it can be inferred automatically
- Datawraps can now be uncompressed folders
- Explicit error message if there's no file named manifest.ini in the datawrap
- Fixed BUG that prevented proper initialization and importation
- BUG FIX: Opening a lot of chromosomes caused mmap to die screaming
- Removed core indexes. Sqlite sometimes chose them instead of user-defined positional indexes, resulting in slow queries
- Doc updates
- Added functions to retrieve the names of imported snps sets and genomes
- Added remote datawraps to the bootstrap module that can be downloaded from pyGeno's website or any other location
- Added a field uniqueId to AgnosticSNPs
- Changed all latin datawrap names to english
- Removed datawrap for dbSNP GRCh37
- Updated pypi package to include bootstrap datawraps
- Fixed tests
- BUG FIX: get()/iterGet() now works for SNPs and Indels
- BUG FIX: Default SNP filter used to return unwanted Nones for insertions
- BUG FIX: Added cast of lines to str in VCF and CasavaSNP parsers. Unicode characters sometimes caused the translation to fail
- BUG FIX: Corrected a typo that caused find in Transcript to recursively die
- Added a new AgnosticSNP type of SNPs that can easily be made from the results of any SNP caller, to make up for the loss of support for casava by illumina. See SNP.AgnosticSNP for documentation
- pyGeno now comes with the murine reference genome GRCm38.78
- pyGeno now comes with the human reference genome GRCh38.78. GRCh37.75 is still shipped with pyGeno but might be removed in upcoming versions
- pyGeno now comes with a datawrap for common human SNPs from dbSNP (SNP_dbSNP142_human_common_all.tar.gz)
- Added a dummy AgnosticSNP datawrap example for bootstrapping
- Changed the interface of the bootstrap module
- The CSV parser can now stream directly to a file
- BUG FIX: looping through CSV lines now works
- Added tests for CSV
- BUG FIX: find in BinarySequence could not find some subsequences at the tail of sequence
- BUG FIX in default SNP filter
- Updated description
- Another BUG FIX in progress bar
- Small BUG FIX in the progress bar that caused epochs to be misrepresented
- 'Specie' has been changed to 'species' everywhere. This breaks existing databases; the only way to fix it is to redo all importations
- Genome import is now much more memory efficient
- BUG FIX: find in BinarySequence could not find subsequences at the tail of sequence
- Added a built-in datawrap with only chr1 and y
- Readme update with more information about importation and a link to the documentation
Easier and much more coherent:
- SNP filtering now has its own module
- SNP Filters are now objects
- SNP Filters must return SequenceSNP, SNPInsert, SNPDeletion or None objects
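
The filter contract listed above (a filter is an object that returns a SequenceSNP, SNPInsert, SNPDeletion or None) can be pictured with a minimal sketch. The classes below are stand-ins written for illustration only and are not pyGeno's actual classes; the `QualityFilter` name, its `filter(snp)` signature, and the `quality`/`ref`/`alt` dictionary keys are all assumptions.

```python
# Stand-in result types for illustration only; NOT pyGeno's real classes.
class SequenceSNP:
    def __init__(self, alleles):
        self.alleles = alleles


class SNPInsert:
    def __init__(self, bases):
        self.bases = bases


class SNPDeletion:
    def __init__(self, length):
        self.length = length


class QualityFilter:
    """Hypothetical filter object: keeps calls whose quality meets a threshold."""

    def __init__(self, threshold):
        self.threshold = threshold

    def filter(self, snp):
        # snp is a plain dict here (hypothetical keys: 'quality', 'ref', 'alt').
        if snp["quality"] < self.threshold:
            return None  # rejected: nothing is applied to the sequence
        ref, alt = snp["ref"], snp["alt"]
        if len(alt) > len(ref):
            return SNPInsert(alt[len(ref):])        # extra bases are inserted
        if len(alt) < len(ref):
            return SNPDeletion(len(ref) - len(alt))  # bases are removed
        return SequenceSNP(alt)                      # same length: substitution
```

Because the filter is an object, its parameters (here the threshold) live on the instance, so the same filtering logic can be reused with different settings.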
Freshly hatched