Conda install error #27

wangzhenzZ · 2024-10-31T02:40:02Z

Hello, I'm trying to install GenEra with conda, but I got this error:

$ conda install -c bioconda genera
Channels:
 - bioconda
 - defaults
 - conda-forge
Platform: linux-64
Collecting package metadata (repodata.json): done
Solving environment: failed

PackagesNotFoundError: The following packages are not available from current channels:

  - genera

Current channels:

  - https://conda.anaconda.org/bioconda
  - defaults
  - https://conda.anaconda.org/conda-forge

To search for alternate channels that may provide the conda package you're
looking for, navigate to

    https://anaconda.org

and use the search bar at the top of the page.

I also couldn't find the package on https://anaconda.org/. It seems that GenEra is not available on conda yet.

josuebarrera · 2024-10-31T12:28:04Z

Dear @wangzhenzZ,

You are right, the conda package doesn't seem to have been released to the public yet. I deeply apologize for this mistake; we'll try to fix it as soon as possible!

Best regards,
@josuebarrera

wangzhenzZ · 2024-11-08T07:23:56Z

I have another question, and I don't think it's worth opening another issue.
genEra -q [query_sequences.fasta] -t [query_taxid] -b [path/to/nr] -d [path/to/taxdump]
How long should this step usually take? I submitted it in Docker 8 days ago, but there are still no results.
The log file shows "Searching for homologs against the DIAMOND database".

josuebarrera · 2024-11-08T12:57:36Z

Dear @wangzhenzZ,

The first step of GenEra usually takes less than a day to run, but this is dependent on the number of CPUs that are allocated to run the analysis. Maybe you are running GenEra on a single CPU?
Could you also please verify that the pipeline is properly writing the output file of the first step? You should find it inside the temporary directory (the location of this directory is specified in the STDOUT). If the file is not getting bigger every few minutes, there could be a problem with the writing permissions or with the storage space of your hard drive.

Best regards,
Josué
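The two checks suggested above (how many CPUs the run can use, and whether the first-step output file keeps growing) can be sketched in Python as follows. This is only an illustrative helper, not part of GenEra: the function names and the polling defaults are assumptions. Note that `os.sched_getaffinity` reflects CPU pinning (cpusets) on Linux but not CPU-time quotas, so a quota-limited container may still report every core.

```python
import os
import time


def visible_cpus():
    """Return the number of CPUs this process is allowed to run on."""
    try:
        # Linux only: honours cpuset / taskset restrictions.
        return len(os.sched_getaffinity(0))
    except AttributeError:
        # Other platforms: fall back to the total CPU count.
        return os.cpu_count() or 1


def is_growing(path, interval_s=300.0, checks=3):
    """Poll the size of `path`; return True as soon as it increases.

    Returns False if the size did not increase in any of the `checks` polls.
    """
    last = os.path.getsize(path)
    for _ in range(checks):
        time.sleep(interval_s)
        size = os.path.getsize(path)
        if size > last:
            return True
        last = size
    return False
```

Run it inside the same container as the analysis, calling `is_growing` on the output file in the temporary directory reported in the STDOUT. If it returns `False` after several minutes, write permissions and free disk space are the first things to check.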

wangzhenzZ · 2024-11-09T16:09:43Z

Dear @josuebarrera

Thank you for your prompt response.
I used the default parameters. I saw in the wiki that "By default, GenEra uses all the available threads in the system", so I didn't specify the number of CPUs.
I did get some result files in the temporary directory tmp_9823_2763: 9823_Diamond_results.bout (418M), tmp_9823.abc (337M), 9823_Diamond_prefiltered_results.bout (0).
But that's all, there is no additional output or log file so far.

josuebarrera · 2024-11-10T14:34:11Z

Dear @wangzhenzZ

The file 9823_Diamond_prefiltered_results.bout should be increasing in size, but it is empty. The error could be associated with the setup of the DIAMOND database. Would you be kind enough to send me the STDOUT of your run? I'd like to see if DIAMOND threw an error.

Best,
Josué

wangzhenzZ · 2024-11-10T14:52:20Z

Dear @josuebarrera

Here are my command and the STDOUT from setting up the database:

diamond makedb \
 --in ./nr_db/nr \
 --db ./nr_db/nr \
 --taxonmap ./accession2taxid/prot.accession2taxid \
 --taxonnodes taxdump/nodes.dmp \
 --taxonnames taxdump/names.dmp

mkdb_out.log

And I got the file nr.dmnd (349G).

No error was displayed...

josuebarrera · 2024-11-10T16:22:07Z

Dear @wangzhenzZ

Thank you for your quick reply. I meant to ask whether you could share the log file of the GenEra analysis with me, so I can see if there is any error. That will help me find any potential issues in the GenEra code.

But I appreciate that you sent me the log for DIAMOND makedb. The database looks perfectly fine, which narrows down my search for the issue.

Given the number of CPUs that you are using, the entire GenEra analysis should take less than a day for most eukaryotic proteomes (20,000 to 30,000 proteins). May I ask how many protein sequences are in your FASTA query file?

Best,
Josué
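A quick way to answer the question above is to count the header lines in the query file. The sketch below assumes a plain-text, uncompressed FASTA file in which every record starts with a `>` line; the function name is illustrative.

```python
def count_fasta_records(path):
    """Count sequences in a FASTA file by counting lines that start with '>'."""
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        return sum(1 for line in handle if line.startswith(">"))
```

A count well above the expected number of protein-coding genes for the species usually points to multiple isoforms per gene in the file.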

wangzhenzZ · 2024-11-10T17:13:17Z

Dear @josuebarrera,

Here is the GenEra log file. There are about 60,000 protein sequences in my query file.

genEra_out.log

josuebarrera · 2024-11-11T12:23:04Z

Dear @wangzhenzZ,

I am very confused about what is causing your problem. The software is not displaying any errors, which makes me think there might be an issue with Docker or your computer cluster that keeps the software stuck on the first step. Would you mind sharing your input FASTA file with me so I can attempt to emulate your error in our computing cluster? I will gladly send you the output files if the analysis runs correctly.

Best regards,
Josué

wangzhenzZ · 2024-11-11T13:45:45Z

Dear @josuebarrera,

Of course. Unfortunately, the GitHub issue interface doesn't support file uploads over 25MB. Could you please provide an alternative way, such as an email address, so I can share my input FASTA file with you? Thank you for your assistance!

Best,
WangZhen

josuebarrera · 2024-11-11T15:46:42Z

Dear @wangzhenzZ,

Feel free to send your FASTA sequences to my email address:

[email protected]

Best,
Josué

wangzhenzZ · 2024-11-12T06:51:41Z

Dear Josué,

I have contacted you via email and shared my input FASTA file. Please check.

Best,
WangZhen

josuebarrera · 2024-11-14T15:53:39Z

Dear @wangzhenzZ,

After running GenEra with your dataset, I confirmed that the pipeline is working correctly. Your dataset contains over 60k protein sequences, while your organism of interest has around 20k protein-coding genes. I suspect there is a large degree of sequence redundancy in your protein dataset (probably due to the retention of all the alternatively spliced variants for each gene) that is greatly increasing the computing time of DIAMOND. In my personal experience, analyzing only the longest isoform per gene gives accurate age estimations for most of the genes in the genome, while keeping all the isoforms of each gene adds little value to the analysis. I'm running GenEra with the argument -y fast to see if we can obtain results from your entire dataset within a reasonable timeframe. Otherwise, I would suggest you choose the longest isoform of each gene in your species of interest and re-run GenEra with the reduced dataset.

I'll keep you posted on the results!

Best,
Josué
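One possible way to reduce a proteome to the longest isoform per gene, as suggested above, is sketched below. It is not part of GenEra, and it rests on a strong assumption: that isoforms of the same gene share an identifier prefix followed by a dot and an isoform suffix (for example `GENE1.1`, `GENE1.2`). Real annotations differ, so the hypothetical `gene_id_from_header` function will usually need to be replaced with one that matches your headers.

```python
def read_fasta(path):
    """Yield (header, sequence) pairs; the header excludes the leading '>'."""
    header, chunks = None, []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            elif line and header is not None:
                chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def gene_id_from_header(header):
    """ASSUMPTION: IDs look like 'GENE.1', 'GENE.2'; adapt to your annotation."""
    parts = header.split()
    seq_id = parts[0] if parts else ""
    # Drop everything after the last dot; IDs without a dot are kept whole.
    return seq_id.rsplit(".", 1)[0]


def longest_isoform_per_gene(records, gene_of=gene_id_from_header):
    """Keep the longest sequence per gene; the first one seen wins on ties."""
    best = {}
    for header, seq in records:
        gene = gene_of(header)
        if gene not in best or len(seq) > len(best[gene][1]):
            best[gene] = (header, seq)
    return list(best.values())


def write_fasta(records, path, width=60):
    """Write (header, sequence) pairs, wrapping sequences at `width` columns."""
    with open(path, "w", encoding="utf-8") as out:
        for header, seq in records:
            out.write(f">{header}\n")
            for i in range(0, len(seq), width):
                out.write(seq[i:i + width] + "\n")
```

Typical use would be `write_fasta(longest_isoform_per_gene(read_fasta("all_isoforms.fasta")), "longest_isoforms.fasta")`, where both file names are placeholders, followed by a GenEra run on the reduced file.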

wangzhenzZ · 2024-11-17T11:19:34Z

Dear Josué,

Please forgive my delayed response. I have received your email and the attached files, and I truly appreciate your time and effort in running the analysis with both parameter settings. I will carefully review the results you provided and thoroughly check my computer cluster to identify any potential problems on my end. Lastly, thank you for developing such an outstanding tool and for your generous support.

Best,
WangZhen

AnupamGautam · 2024-11-25T22:44:33Z

Dear @wangzhenzZ,

You can try installing GenEra with conda. Please let us know if it works correctly for you.

https://anaconda.org/bioconda/genera

Best regards,
Anupam

wangzhenzZ · 2024-11-26T14:12:31Z

Dear Anupam,

I installed GenEra with conda, and it is working correctly for me. Thanks for your work!

Best,
WangZhen

josuebarrera · 2024-11-26T14:17:20Z

I'll close this thread now that both issues have been resolved. I'll modify the wiki to add the instructions for the conda installation.
Thank you, @AnupamGautam, for generating the conda recipe!

josuebarrera self-assigned this Nov 10, 2024

josuebarrera closed this as completed Nov 26, 2024
