rela.cpython-37m-x86_64-linux-gnu.so: undefined symbol #36

Closed
haopan27 opened this issue Aug 15, 2023 · 1 comment

@haopan27

As I attempted to start training using the command provided in the readme:

python run.py --adhoc --cfg conf/c02_selfplay/liars_sp.yaml \
    env.num_dice=1 \
    env.num_faces=4 \
    env.subgame_params.use_cfr=true \
    selfplay.cpu_gen_threads=0  \
    selfplay.threads_per_gpu=8

I encountered the following error:

Traceback (most recent call last):
  File "run.py", line 25, in <module>
    import cfvpy.tasks
  File "/home/myname/rebel-main/cfvpy/tasks.py", line 15, in <module>
    import cfvpy.selfplay
  File "/home/myname/rebel-main/cfvpy/selfplay.py", line 27, in <module>
    import cfvpy.rela
ImportError: /home/myname/rebel-main/cfvpy/rela.cpython-37m-x86_64-linux-gnu.so: undefined symbol: _ZN8pybind116detail11type_casterIN2at6TensorEvE4loadENS_6handleEb
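For readers unfamiliar with mangled C++ names, the symbol in the error can be partly decoded by hand. The sketch below is not a real demangler: the helper `itanium_identifiers` is a hypothetical name, and it only pulls out the length-prefixed identifiers that the Itanium C++ ABI embeds in the name. That is enough to see that the missing function is pybind11's `type_caster` for `at::Tensor`, which hints at a mismatch between the torch headers used at build time and the torch library loaded at import time.

```python
# Rough sketch: extract the length-prefixed identifiers from an
# Itanium-ABI mangled name. This is NOT a full demangler; it ignores
# template, substitution, and type encodings and only reads "<len><name>" runs.

SYMBOL = "_ZN8pybind116detail11type_casterIN2at6TensorEvE4loadENS_6handleEb"


def itanium_identifiers(mangled):
    """Return the identifiers encoded as <decimal length><characters>."""
    names = []
    i = 0
    while i < len(mangled):
        if mangled[i].isdigit():
            # Read the full decimal length prefix.
            j = i
            while j < len(mangled) and mangled[j].isdigit():
                j += 1
            length = int(mangled[i:j])
            # The identifier is the next `length` characters.
            names.append(mangled[j:j + length])
            i = j + length
        else:
            # Skip ABI markers such as _Z, N, I, E, S_, and builtin type codes.
            i += 1
    return names


if __name__ == "__main__":
    print(itanium_identifiers(SYMBOL))
```

Where binutils is installed, running `c++filt` on the symbol gives the complete signature.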

I am not sure what went wrong. My guess is that it has something to do with the make process. I would appreciate it if anyone could shed some light on this issue.

Additional info:
When I first attempted make, the compiler was complaining about not being able to find py::module_ in /opt/conda/envs/rebel/lib/python3.7/site-packages/torch/include/torch/csrc/utils/pybind.h (line 176)

According to the pybind11 documentation,

py::module has been renamed py::module_, but a backward compatible typedef has been included

So I changed py::module_::import to py::module::import. I completed make without any error, but perhaps this is not the right thing to do.

@haopan27
Author

I described something similar here

Inspired by the subsequent comment by rwgk, I realized that my torch version might be too new (1.13.1 instead of 1.4.0 as in requirements.txt). I downgraded it and the error is gone. Closing this issue now.
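A quick check like the following could catch this kind of mismatch before building. It is a minimal sketch under assumptions: the helper names `pinned_version` and `matches_pin` are made up for illustration, and the pin is assumed to be written as `torch==1.4.0` in requirements.txt (the exact line format in the repository was not verified).

```python
# Minimal sketch: compare an installed package version with the version
# pinned in a requirements file. Assumes pins use the "name==version" form.


def pinned_version(requirements_text, package):
    """Return the version pinned for `package`, or None if it is not pinned."""
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if line.startswith(package + "=="):
            return line.split("==", 1)[1].strip()
    return None


def matches_pin(installed, pinned):
    """Compare versions, ignoring a local build suffix such as '+cu117'."""
    return installed.split("+", 1)[0] == pinned


if __name__ == "__main__":
    example = "numpy\ntorch==1.4.0  # assumed pin format\n"
    pin = pinned_version(example, "torch")
    print(pin, matches_pin("1.13.1", pin))
```

In practice the installed version would come from `torch.__version__`, and the text from reading requirements.txt.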
