Tags · raphael-group/hatchet

v2.1.0

pinning numpy<2 (#230)

* pinning numpy<2
* added recent pythons to build matrix
* going with ruff for linting/formatting etc

Oct 15, 2024
7115c38

v2.0.1

Update pyproject.toml (#203)

Dec 18, 2023
07602ae

v2.0.0

Add mirrored haplotype frequency inference (#181)

* Fix bug with phase_snps and chrnotation

* Allow mhBAF values above 0.5

* Fixed genotype_snps error message

* Set merging to 0 by default just to make sure

* Update default value in ArgParsing

* Repeat clustering 10 times per K

* Added 'restarts' option to cluster_bins

* Add sample subsetting to cluster-bins

* Add warnings for low tau

* Updated plotting to respect mhBAF

* diploidbaf now requires all samples to be close to 0.5

* Update plotting for mhbaf

* Fix count_alleles error in error msg

* Update compute_cn 'clonal' argument parsing

* Minor update to docstring

* Add -pthread flag to CMakeLists

* Fix issue with sexchroms in combine_counts_fw

* Update test data for mirror/mhbaf

* Trim trailing whitespace

* Formatting and unused imports

* More formatting

* Add correction for haplotype switching on chromosome arms

* Update default silhouette score to avoid no-solution issues

* Update tests (add new bb column for haplotype correction)

* Source file formatting

* Minor change to code formatting

* Bump minor version number to 1.2

* Count haplotype switches using imbal. samples only

* Fix doubled log messages

Jan 25, 2023
d3b38e4
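
The entry above mentions repeating clustering several times per K and a 'restarts' option for cluster_bins. The general idea of random restarts can be sketched as below. This is a minimal illustration built on a toy 1-D k-means, not HATCHet's clustering code; the names `kmeans_1d` and `best_of_restarts`, the inertia score, and the sample data are all assumptions made for the example.

```python
import random


def kmeans_1d(values, k, rng, iters=50):
    """Toy 1-D k-means (hypothetical helper); returns (inertia, sorted centers)."""
    centers = rng.sample(values, k)  # random initial centers drawn from the data
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # Keep the old center when a group ends up empty.
        new_centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
        if new_centers == centers:
            break
        centers = new_centers
    inertia = sum(min((v - c) ** 2 for c in centers) for v in values)
    return inertia, sorted(centers)


def best_of_restarts(values, k, restarts=10, seed=0):
    """Run the clustering `restarts` times and keep the lowest-inertia run."""
    rng = random.Random(seed)
    runs = [kmeans_1d(values, k, rng) for _ in range(restarts)]
    return min(runs, key=lambda run: run[0])


# Made-up values for illustration only.
data = [0.10, 0.12, 0.11, 0.48, 0.50, 0.52, 0.90, 0.88]
inertia, centers = best_of_restarts(data, k=3, restarts=10)
print(inertia, centers)
```

Because each run starts from a different random initialization, keeping the best of several runs can only match or improve on a single run with the same seed, at the cost of proportionally more compute.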

v1.2.0

Add mirrored haplotype frequency inference (#181)

* Fix bug with phase_snps and chrnotation

* Allow mhBAF values above 0.5

* Fixed genotype_snps error message

* Set merging to 0 by default just to make sure

* Update default value in ArgParsing

* Repeat clustering 10 times per K

* Added 'restarts' option to cluster_bins

* Add sample subsetting to cluster-bins

* Add warnings for low tau

* Updated plotting to respect mhBAF

* diploidbaf now requires all samples to be close to 0.5

* Update plotting for mhbaf

* Fix count_alleles error in error msg

* Update compute_cn 'clonal' argument parsing

* Minor update to docstring

* Add -pthread flag to CMakeLists

* Fix issue with sexchroms in combine_counts_fw

* Update test data for mirror/mhbaf

* Trim trailing whitespace

* Formatting and unused imports

* More formatting

* Add correction for haplotype switching on chromosome arms

* Update default silhouette score to avoid no-solution issues

* Update tests (add new bb column for haplotype correction)

* Source file formatting

* Minor change to code formatting

* Bump minor version number to 1.2

* Count haplotype switches using imbal. samples only

* Fix doubled log messages

Jan 25, 2023
d3b38e4

v1.1.1

Vb/multiprocessing bugfix (#161)

* Bug fix in count_alleles; Fix to run demo end-end on specific chromosome(s)
* bug fix with multiprocessing handler IDs

Aug 30, 2022
611bfbd
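
The fix above concerns handler IDs in multiprocessing. One common pattern, sketched here under stated assumptions, tags each task with an ID when it enters the queue and orders the collected results by that ID rather than by completion order or by the result values. Threads and `queue.Queue` stand in for processes to keep the sketch short; `run_tasks` and the ID scheme are hypothetical and are not taken from HATCHet's code.

```python
import queue
import threading


def run_tasks(tasks, n_workers=3):
    """Apply fn(arg) for each (fn, arg) task; return results in submission order."""
    todo = queue.Queue()
    done = queue.Queue()
    for handler_id, task in enumerate(tasks):
        todo.put((handler_id, task))  # ID assigned when the task enters the queue

    def worker():
        while True:
            try:
                handler_id, (fn, arg) = todo.get_nowait()
            except queue.Empty:
                return  # no work left for this worker
            done.put((handler_id, fn(arg)))

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    collected = [done.get() for _ in range(len(tasks))]
    # Sort on the ID only, so results need not be comparable with each other.
    collected.sort(key=lambda pair: pair[0])
    return [result for _, result in collected]


results = run_tasks([(len, "chr1"), (str.upper, "chr2"), (abs, -3)])
print(results)  # [4, 'CHR2', 3]
```

Sorting on the ID alone keeps the output order stable even when workers finish out of order, and it avoids comparing results that may have no natural ordering.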

v1.1.0

Bug fix in count_alleles; Fix to run demo end-end on specific chromosome(s) (#158)

* Bug fix in count_alleles; Fix to run demo end-end on specific chromosome(s)

Aug 28, 2022
71c7da7

v1.0.3

Merge pull request #155 from raphael-group/vb/hotfix_errormsg

Bug fix with the error() function

Aug 19, 2022
03fe4cc

v1.0.2

Fix bug w/ argument order in run.py phase_snps (#148)

* Fix bug w/ argument order in run.py phase_snps

* using Worker for count_alleles to further debug issue 150 (#151)

Co-authored-by: Vineet Bansal

Aug 1, 2022
9832b01

v1.0.1

Rename clustering commands so that cluster-bins is new functionality (#144)

* Rename old cluster_bins to cluster_bins_gmm and _loc to cluster_bins
* Update docs for renamed clustering
* bumped version

Co-authored-by: Vineet Bansal

Jul 3, 2022
9925e11

v1.0.0

HATCHet 1.0 (#136)

* Added array command to generate intermediate input to abin

* Reindex clusters to be in [1, n_clusters]

* Added X and Y handling to chromosome sorting

* Removed tracemalloc calls

* Handle X Y chromosomes

* Write #CHR column always

* BB file is now exclusively autosomes

* created running Phase.py

* passed args to Phase.py

* Added total read correction to abin

* retrieve 1000GP tarball, extract

* set up multiprocessing template

* added shapeit

* concat phased vcfs

* added log to see proportion of phased SNPs

* Bugfix with SNP overlapping centromere boundary

* Added min cluster size option and outlier removal to kdeBB

* remove intermediate files

* phase with chr prefix incompatible with refpanel

* phase with chr prefix, add chr back

* simplified renaming of chroms and snplist dict in ArgParse

* added liftover before and after phasing

* separates file download and phasing

* post-testing cleanup

* comBBo now reads gzipped phased vcfs

* Added centromeres txt file to resources

* Added formArray to preprocessing

* adaptiveBin working with only intermediate arrays (no counts)

* Make .tar.gz compression default behavior

* countPos only runs if array output not present

* Use 5kB intervals as candidate bin thresholds for sex chroms

* Remove bins with outlier RD for KDE step only

* add logArgs to cluBB

* Added .vscode to gitignore

* Removed duplicated preprocessing section from ini

* Added phasing to adaptive binning command

* Remove array forming code from abin

* Remove typo affecting ALPHA column

* Add output directory if it doesn't exist

* Update reference to config file

* Change array-abin interface to use directory instead of filename stem

* phasePrep downloads all files

* uncommented scripts

* Create doc_count_reads_fw

* Rename doc_count_reads_fw to doc_count_reads_fw.md

* Update doc_count_reads_fw.md

* Update doc_count_reads_fw.md

* Create doc_combine_counts_fw.md

* Update doc_combine_counts_fw.md

* Update doc_count_reads_fw.md

* abin removes intermediate BB files after merge

* Removed reference to obsolete ini section

* Added and commented out code to skip sex chromosomes

* Updated calls/output files for array

* Skip sex chroms when phasing

* Add hg19 centromeres file to resources

* Explicit references to centromere files in setup

* Combine formArray and countPos in count_reads

* Added total read counting

* Adjusted interface for abin command, fixed sex chrom dummy SNPs

* renamed phasing commands and integrated into run.py

* Renamed abin commands

* Better error message for missing output dir

* Update count_reads.md to match adaptive binning

* Add usage example

* Update doc_combine_counts.md for adaptive binning

* Added ref genome argument to combine counts doc

* Update doc_combine_counts.md

* Cleaned up order of arguments

* ran Gundem_A17 single sample with abin via run.py

* added --use_em to run.py

* Abin bugfix, kde bugfix, kde with scaling

* Fix typo in print message

* Updated _fw command paths in main

* Allowed users to manually specify clusters with high base CN

* Update bin start indices (now half-open at end)

* At least 2 processes for count_reads

* Sort chromosomes in combine_counts input

* Fix indexing, use all samples for phasing blocks

* Update diploidbaf default value in doc to match code

* Change "use_em" to "use_mm" to match code

* Add variable-width binning and swap count-alleles and count-reads commands

* Moved "count_reads" after "count_alleles" for consistency

* Update run_hatchet for adaptive binning

* Remove unused section and fix typo

* Update doc_runhatchet.md

* Updated demo-complete for adaptive binning

* Bugfixes to count_reads_fw

* Added fixed-width flag

* Changed EM to be default instead of MM

* Added recommendations for variable-width bins

* Updated tests for reindexing and _fw cmd names

* Added abin test script and output

* Added phasing and abin dependencies

* Remove conda dependencies from setup.py

* Abin phase test script

* Typo 'phases'

* Update help text for phasing file

* Added --seed to shapeit calls

* Added abin and phasing dependencies to installation

* Remove KDE clustering

* Clean up master diff

* Add abin/phase dependencies to GA build

* Fix unzip and change permissions GH Actions

* set picards WARN_ON_MISSING_CONTIG to true

* Fixed phasing alt contigs

* Update function description

* Broke out abin/phase tests into parts, added checks for dependencies

* Add testscript back (for now) for coverage on fixed-width commands

* Add test SNP data

* Add check for shapeit and picard in argparsing

* Update test bbc to match cluster IDs

* Add back best.bbc.ucn with new clusters

* Ignore phase argument when phasefile not present

* Handle case of <=1 bin in arm

* Skip MSR check when arm has only 1 bin

* Support pyomo solving without building cpp

* Remove call to config.has_option

* Undo changes to run.py

* Added arguments to specify dependency paths

* Add locality clustering command

* Implemented multi-sample BAF inference

* Added option for split-bin objective in Python solver

* Add loc-clust switch in run.py, fixed-width bugfix

* Create doc_cluster_bins_loc.md

* Update doc_cluster_bins_loc.md

* Bugfix for combine_counts arguments

* Sort clusters by size for consistency

* Phase_snps now starts -j workers (instead of -j/3 to work around pipes)

* Add note about locality clustering to pipeline doc

* Updated hatchet.ini for adaptive binning and locality clustering

* Added args: readquality for count-reads, bgzip for phase-snps

* Add hmmlearn as dependency in setup.py

* Fixed bug with EM likelihood output

* Update error messages

* Added 1D and 2D plotting as addnl command

* Typos and bugfix in num. processes

* Revert change to EM function

* Fixed bug in forming SNP count arrays without phasing

* Added intermediates-only option to count-reads

* Ability to use picard from conda or from source; memory flags

* Update hatchet.ini

* Removed hardcoded memory override for picard invocation

* Make 'path' argument names match tool names

* Removed dead code, updated end for last bin per arm (will break tests)

* Remove duplicate .ini section

* Update combine-counts test cases

* Handle singleton cluster (K=1)

* Add tests for locality clustering

* Update fw command names in test, clustered files with new indexing

* Fix off-by-1 error in read counting

* Style cleanup following Vineet's PR comments

* Move contents of 'resources' to 'data' and update 'combine-counts' test data

* Remove commented code

* Remove skipif statements for dependencies

* Cleaned up download_panel arguments

* Add 1D/2D cluster plotting with consistent colors

* New plotting command (missed in prev commit)

* Make plot filenames use same convention

* Add option to show centromeres in 1D/2D copy-number plots

* Remove commented memory profiling code

* Remove explicit handling of sex chromosomes in genotype_snps

* Remove old commands (preprocess and kde clustering)

* Make new features default

* Clean up plot issues (many clones, sample titles)

* Use standard subdirectories e.g. /bb

* Ignore test data files for Linguist annotation

* Correct REF and ALT column names when reading baf file

* Moved fixed-length bin test data into 'fl' subdirectory

* Moved variable-length and phasing test data to 'v' subdir

* Combine variable-length and loc-clust tests in 1 script

* Update paths for test_phase.py

* not LFS-tracking any files anymore (no large files included in repo)

* gitattributes revert

* added sample bam file as untracked raw file; additional checks

* Abin refactor (#132)

A bunch of refactorings for readability/faster tests

* sorting multiprocessing results by handler IDs (as they're introduced in the queue), not the results

* checking for cached reference/panel files

* changing cache key to force invalidation

* bug fix in picard invocation

* Moved arg checks to ArgParsing and fixed panel dir check

* Removed check for large files to avoid git-lfs errors

* Applied picard change to 'check' command

* Fix for CI tests (#134)

Fix for CI tests

* compute-cn now uses 'clonal' argument in diploid case

* Update doc_cluster_bins.md

* Update doc_cluster_bins.md

* Update doc_cluster_bins_loc.md

* Vb/pyproject (#135)

some checks from scikit-hep diagnostics

* CI run on master/develop

* trying reduced phasing data for CI (#137)

* Turned merging in compute_cn off by default

* Updated handling of 'clonal' argument

* more checks; cached checks; iterating through all supported commands (#140)

* All 1D plots now have grid off and centromeres by default

* Updated documentation

* Added clarification for new features

* Update arguments in count_reads doc

* Update download_panel description

* Updated full demo and added notes to others

* Added plotting docs and updated recommendations

* Update docs for new plotting commands

* Use regex=False instead of escape char for clarity

* Fixed typo noted in issue #98

* Update .gitignore with new test data paths

* Changed 'length' to 'width' in test files and data

* Add caveat to check doc

* Fixed chrnotation bug in run.py for phasing

* Vb/checkbetter (#143)

* more checks; cached checks; iterating through all supported commands
* Added check for bgzip under the phase-snps command

* Restore default merging behavior and update doc values

* Update docs and add genotype_snps doc

* Updated count_alleles doc and removed unused arg

* Fix issue with removed argument

* Protecting check-solver against exceptions so it can be used safely with HATCHet check

* Update docs

* not failing fast

* bumped hash key

Co-authored-by: Matt Myers
Co-authored-by: Brian J. Arnold
Co-authored-by: Brian Arnold

Jul 1, 2022
48faa92
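
Several items in the list above touch ordering conventions: X and Y handling in chromosome sorting, reindexing clusters to [1, n_clusters], and sorting clusters by size. A minimal sketch of what such helpers might look like follows. The function names, the optional `chr` prefix handling, the placement of unknown contigs last, and the largest-first ordering are assumptions for illustration, not HATCHet's implementation.

```python
from collections import Counter


def chrom_key(name):
    """Sort key: numbered chromosomes in numeric order, then X, then Y."""
    bare = name[3:] if name.startswith("chr") else name
    if bare.isdigit():
        return (0, int(bare), "")
    order = {"X": 1, "Y": 2}
    # Assumption: any other contig sorts last, alphabetically.
    return (order.get(bare, 3), 0, bare)


def reindex_by_size(labels):
    """Relabel clusters as 1..n_clusters, largest cluster first (assumed order)."""
    counts = Counter(labels)
    # Ties are broken by the original label (as a string) so the mapping is deterministic.
    ranked = sorted(counts, key=lambda lab: (-counts[lab], str(lab)))
    mapping = {old: new for new, old in enumerate(ranked, start=1)}
    return [mapping[lab] for lab in labels]


print(sorted(["chrX", "chr10", "chr2", "chrY", "chr1"], key=chrom_key))
# ['chr1', 'chr2', 'chr10', 'chrX', 'chrY']
print(reindex_by_size([7, 7, 3, 7, 3, 12]))
# [1, 1, 2, 1, 2, 3]
```

A numeric key avoids the lexicographic trap where `chr10` sorts before `chr2`, and a deterministic relabeling makes cluster IDs comparable across runs.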