IndexError: too many indices for array: array is 0-dimensional, but 1 were indexed #15

Closed
chenshixinnb opened this issue Jan 15, 2022 · 16 comments

@chenshixinnb

I first ran the CPU step with this command: ./run_alphafold.sh -d $DATA_DIR -o $OUTPUT_DIR -p multimer -i $INPUT_DIR/test.fasta -t 2021-11-01 -m model_1 -f

@chenshixinnb
Author

The second step uses the GPU, and this is where the error occurred. Command: ./run_alphafold.sh -d $DATA_DIR -o $OUTPUT_DIR -p multimer -m model_1,model_2,model_3,model_4,model_5 -i $INPUT_DIR/test.fasta -t 2021-11-01

@chenshixinnb
Author

2022-01-14 15:50:56.884239: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
I0114 15:51:34.441559 47327477676160 templates.py:857] Using precomputed obsolete pdbs /public/software/.local/easybuild/software/alphafold/data2/pdb_mmcif/obsolete.dat.
I0114 15:51:35.627506 47327477676160 xla_bridge.py:243] Unable to initialize backend 'tpu_driver': NOT_FOUND: Unable to find driver in registry given worker:
I0114 15:51:35.627812 47327477676160 xla_bridge.py:243] Unable to initialize backend 'gpu': NOT_FOUND: Could not find registered platform with name: "cuda". Available platform names are: Interpreter Host
I0114 15:51:35.628199 47327477676160 xla_bridge.py:243] Unable to initialize backend 'tpu': INVALID_ARGUMENT: TpuPlatform is not available.
W0114 15:51:35.628336 47327477676160 xla_bridge.py:248] No GPU/TPU found, falling back to CPU. (Set TF_CPP_MIN_LOG_LEVEL=0 and rerun for more info.)
I0114 15:51:36.549557 47327477676160 run_alphafold.py:407] Have 1 models: ['model_1']
I0114 15:51:36.549774 47327477676160 run_alphafold.py:423] Using random seed 6020071004121300369 for the data pipeline
I0114 15:51:36.549989 47327477676160 run_alphafold.py:156] Predicting fuheti
I0114 15:51:36.700862 47327477676160 run_alphafold.py:202] Running model model_1 on fuheti
Traceback (most recent call last):
  File "/public/software/.local/easybuild/software/ParallelFold/ParallelFold/run_alphafold.py", line 455, in <module>
    app.run(main)
  File "/public/software/.local/easybuild/software/Anaconda3/2020.02/envs/alphafold/lib/python3.8/site-packages/absl/app.py", line 312, in run
    _run_main(main, args)
  File "/public/software/.local/easybuild/software/Anaconda3/2020.02/envs/alphafold/lib/python3.8/site-packages/absl/app.py", line 258, in _run_main
    sys.exit(main(argv))
  File "/public/software/.local/easybuild/software/ParallelFold/ParallelFold/run_alphafold.py", line 429, in main
    predict_structure(
  File "/public/software/.local/easybuild/software/ParallelFold/ParallelFold/run_alphafold.py", line 205, in predict_structure
    processed_feature_dict = model_runner.process_features(
  File "/public/software/.local/easybuild/software/ParallelFold/ParallelFold/alphafold/model/model.py", line 131, in process_features
    return features.np_example_to_features(
  File "/public/software/.local/easybuild/software/ParallelFold/ParallelFold/alphafold/model/features.py", line 83, in np_example_to_features
    num_res = int(np_example['seq_length'][0])
IndexError: too many indices for array: array is 0-dimensional, but 1 were indexed

@Zuricho
Owner

Zuricho commented Jan 15, 2022

We are working on this problem. I will reply again when we have any updates.

@Zuricho
Owner

Zuricho commented Jan 15, 2022

Could you send me your input FASTA file so I can check it? My email is [email protected]

@chenshixinnb
Author

I have sent it, thank you.

@Zuricho
Owner

Zuricho commented Jan 15, 2022

OK, I understand. Try using -m model_1_multimer instead of -m model_1 in the GPU part.
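For anyone reading later, here is a minimal sketch of why the monomer model name triggers this IndexError on multimer features. The multimer run log in this thread reports 'seq_length': (), which is a 0-dimensional array, while the failing line in features.py indexes it with [0]. The 1-D monomer layout below is an assumption about how the monomer pipeline stores seq_length, and read_num_res is a hypothetical helper written only for illustration. It is not part of ParallelFold or AlphaFold, and the fix suggested here is still to pass the multimer model name rather than to patch features.py.

```python
import numpy as np

# Shape taken from the multimer log in this thread: seq_length is 0-dimensional.
multimer_example = {"seq_length": np.array(646, dtype=np.int32)}

# Assumption: the monomer pipeline stores seq_length as a 1-D array
# (one entry per residue), which is why indexing with [0] works there.
monomer_example = {"seq_length": np.full(646, 646, dtype=np.int32)}


def read_num_res(np_example):
    """Hypothetical helper: read the residue count from a 0-d or 1-d array."""
    return int(np.atleast_1d(np_example["seq_length"])[0])


# The monomer-style access from the traceback fails on multimer features.
try:
    int(multimer_example["seq_length"][0])
    monomer_style_failed = False
except IndexError as err:
    monomer_style_failed = True
    print(f"IndexError: {err}")

# The tolerant helper handles both layouts.
print(read_num_res(multimer_example), read_num_res(monomer_example))
```

The point of the sketch is only that the two presets produce differently shaped features, so the model name has to match the preset given with -p.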

@chenshixinnb
Author

I used this command: $PROGRAM_DIR/run_alphafold.sh -d $DATA_DIR -o $OUTPUT_DIR -p multimer -m model_1_multimer -i $INPUT_DIR/test2.fasta -t 2021-11-01

Now the run stays stuck at this point:

@chenshixinnb
Author

2022-01-15 13:35:27.686400: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
I0115 13:36:03.102030 47367004878976 templates.py:857] Using precomputed obsolete pdbs /public/software/.local/easybuild/software/alphafold/data2/pdb_mmcif/obsolete.dat.
I0115 13:36:04.031663 47367004878976 xla_bridge.py:243] Unable to initialize backend 'tpu_driver': NOT_FOUND: Unable to find driver in registry given worker:
I0115 13:36:04.031914 47367004878976 xla_bridge.py:243] Unable to initialize backend 'gpu': NOT_FOUND: Could not find registered platform with name: "cuda". Available platform names are: Interpreter Host
I0115 13:36:04.032272 47367004878976 xla_bridge.py:243] Unable to initialize backend 'tpu': INVALID_ARGUMENT: TpuPlatform is not available.
W0115 13:36:04.032407 47367004878976 xla_bridge.py:248] No GPU/TPU found, falling back to CPU. (Set TF_CPP_MIN_LOG_LEVEL=0 and rerun for more info.)
I0115 13:36:04.889354 47367004878976 run_alphafold.py:407] Have 1 models: ['model_1_multimer']
I0115 13:36:04.889528 47367004878976 run_alphafold.py:423] Using random seed 9133243819396004162 for the data pipeline
I0115 13:36:04.889718 47367004878976 run_alphafold.py:156] Predicting test2
I0115 13:36:05.007085 47367004878976 run_alphafold.py:202] Running model model_1_multimer on test2
I0115 13:36:05.007616 47367004878976 model.py:165] Running predict with shape(feat) = {'aatype': (646,), 'residue_index': (646,), 'seq_length': (), 'msa': (3101, 646), 'num_alignments': (), 'template_aatype': (4, 646), 'template_all_atom_mask': (4, 646, 37), 'template_all_atom_positions': (4, 646, 37, 3), 'asym_id': (646,), 'sym_id': (646,), 'entity_id': (646,), 'deletion_matrix': (3101, 646), 'deletion_mean': (646,), 'all_atom_mask': (646, 37), 'all_atom_positions': (646, 37, 3), 'assembly_num_chains': (), 'entity_mask': (646,), 'num_templates': (), 'cluster_bias_mask': (3101,), 'bert_mask': (3101, 646), 'seq_mask': (646,), 'msa_mask': (3101, 646)}
2022-01-15 13:39:56.909318: E external/org_tensorflow/tensorflow/compiler/xla/service/slow_operation_alarm.cc:55]


Very slow compile? If you want to file a bug, run with envvar XLA_FLAGS=--xla_dump_to=/tmp/foo and attach the results.
Compiling module jit_apply_fn.96373


@Zuricho
Owner

Zuricho commented Jan 15, 2022

It's strange that your JAX is using the CPU rather than the GPU. Did you set up the environment properly? For example, did you install the CUDA toolkit and load your local CUDA environment?

@chenshixinnb
Author

Thanks. How do I make sure that JAX is a GPU version? I confirmed that the run loaded the environment, and previous predictions of monomer structures were successful.

@Zuricho
Owner

Zuricho commented Jan 17, 2022

Yeah, it's really strange that your monomer models work and the multimer models do not. Usually, if your GPU cannot be detected, neither monomer nor multimer runs use the GPU.

@Zuricho
Owner

Zuricho commented Jan 17, 2022

To see whether your program detects your GPU, you can use this:

python
>>> import tensorflow as tf; print(tf.config.list_physical_devices("GPU"))
>>> import jax; print(jax.devices())
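The same check can be run as a small script, for example at the start of a batch job. This is a rough sketch, not something shipped with ParallelFold: jax_device_report is a hypothetical name, and it relies only on jax.devices() and each device's platform attribute. If every reported platform is "cpu", JAX has fallen back to the CPU, as in the logs earlier in this thread.

```python
import importlib


def jax_device_report():
    """Hypothetical helper: summarise the devices JAX can see.

    Returns a plain dict so the result can be logged before a long run.
    If jax cannot be imported, the report says so instead of raising.
    """
    try:
        jax = importlib.import_module("jax")
    except ImportError:
        return {"installed": False, "platforms": [], "has_accelerator": False}

    # Each device exposes a platform string such as "cpu" or "gpu".
    platforms = sorted({device.platform for device in jax.devices()})
    return {
        "installed": True,
        "platforms": platforms,
        # Anything other than "cpu" counts as an accelerator here.
        "has_accelerator": any(p != "cpu" for p in platforms),
    }


report = jax_device_report()
print(report)
```

If has_accelerator is False while TensorFlow lists a GPU, the problem is likely in the JAX or jaxlib installation rather than in the driver.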

@chenshixinnb
Author

chenshixinnb commented Jan 17, 2022

>>> import tensorflow as tf; print(tf.config.list_physical_devices("GPU"))
[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
>>> import jax; print(jax.devices())
WARNING:absl:No GPU/TPU found, falling back to CPU. (Set TF_CPP_MIN_LOG_LEVEL=0 and rerun for more info.)
[<jaxlib.xla_extension.Device object at 0x2ba093904930>]

@Zuricho
Owner

Zuricho commented Jan 22, 2022

So you do indeed need to check your JAX version to make sure the model can use the GPU. Maybe reinstall the environment?
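One possible way to check the installed versions is sketched below. It uses only the standard library (importlib.metadata, available from Python 3.8). Both function names are hypothetical. The "cuda" test is a heuristic that assumes the older wheel naming, where CUDA builds of jaxlib carried a local version tag such as 0.1.69+cuda111; newer JAX releases package GPU support differently, so a missing tag there does not prove a CPU-only build.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def looks_cuda_enabled(jaxlib_version):
    """Heuristic (assumption): older CUDA wheels had '+cudaXXX' in the version."""
    return jaxlib_version is not None and "cuda" in jaxlib_version.lower()


for name in ("jax", "jaxlib"):
    print(name, installed_version(name))

print("jaxlib looks CUDA-enabled:", looks_cuda_enabled(installed_version("jaxlib")))
```

Run it inside the same conda environment that run_alphafold.sh activates, since a different environment may hold a different jaxlib build.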

@chenshixinnb
Author

OK,thanks

@WishIWasBornInTheCreteaceousEra

Hi Zuricho,

Unfortunately the issue still persists, because the proper model name for the GPU part is now model_1_multimer_v3. I figured I'd leave this comment for posterity's sake.
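Since the valid model names have changed between releases, a quick sanity check before launching the GPU step can catch a preset and model name mismatch early. The sketch below is a hypothetical helper, not part of ParallelFold. It assumes only what this thread shows: model names are passed to -m as a comma-separated list, and multimer model names contain the word "multimer" (model_1_multimer, model_1_multimer_v3).

```python
def mismatched_models(preset, model_names):
    """Hypothetical check: list model names that do not fit the preset.

    preset      -- value passed to -p, e.g. "multimer" or "monomer"
    model_names -- comma-separated value passed to -m
    """
    names = [n.strip() for n in model_names.split(",") if n.strip()]
    if preset == "multimer":
        # Multimer runs need multimer model names.
        return [n for n in names if "multimer" not in n]
    # Any other preset is treated as a monomer variant here (assumption).
    return [n for n in names if "multimer" in n]


# The combination from the original report: multimer preset, monomer names.
print(mismatched_models("multimer", "model_1,model_2,model_3,model_4,model_5"))

# The names suggested in this thread pass the check.
print(mismatched_models("multimer", "model_1_multimer"))
print(mismatched_models("multimer", "model_1_multimer_v3"))
```

The check does not know which suffix your installed parameters use, so it cannot tell model_1_multimer from model_1_multimer_v3; it only flags monomer names under the multimer preset and the reverse.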
