
How to use TensorRT in trained model #45

Open

HUXING8 opened this issue Jul 5, 2024 · 10 comments

@HUXING8 commented Jul 5, 2024

I am going to use TensorRT to accelerate my inference step, but I am running into several issues; for example, the input data is a dict, so the model cannot be converted to ONNX.

@SunHaoOne

> I am going to use TensorRT to accelerate my inference step, but I am running into several issues; for example, the input data is a dict, so the model cannot be converted to ONNX.

For ONNX, dictionary inputs are not allowed. If the forward function looks like this:

def forward(self, data):
    x1 = data['label1']
    x2 = data['label2']
    return x1, x2

dummy_input = data  # a dict: torch.onnx.export cannot trace this directly

Then converting it to the following format would work:

def forward(self, x1, x2):
    return x1, x2

dummy_input = (data['label1'], data['label2'])  # a tuple of plain tensors
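
The export call then takes this tuple instead of the dict. A minimal self-contained sketch (the module, tensor shapes, and input/output names here are placeholders, not QCNet's real interface):

import torch

class TwoInput(torch.nn.Module):
    # stands in for a model whose forward takes plain tensors
    def forward(self, x1, x2):
        return x1 + 1, x2 * 2

model = TwoInput().eval()
dummy_input = (torch.randn(1, 8), torch.randn(1, 8))  # a tuple, not a dict

torch.onnx.export(
    model,
    dummy_input,                  # passed positionally to forward
    "model.onnx",
    input_names=["x1", "x2"],
    output_names=["y1", "y2"],
    opset_version=16,
)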

@HUXING8 commented Jul 8, 2024

[image: QCNet code screenshot]

In this picture, QCNet's input data is a nested dict, which means I need to flatten it into a list of plain tensor parameters following your method.

def forward(self, data):
    scene_enc = self.encoder(data)
    pred = self.decoder(data, scene_enc)
    return pred

Editing the function:

def forward(self, x1, x2, x3, x4, x5):
    # each of x1..x5 is a tensor from the original dict
    scene_enc = self.encoder(x1, x2, x3, x4, x5)
    pred = self.decoder(x1, x2, x3, x4, x5, scene_enc)
    return pred

Moreover, my model has already been trained with the original code from the author (Zhou). Is it necessary to rebuild the network structure, converting the dict to tensors in every layer, so that it can receive pure tensor data?

@SunHaoOne commented Jul 8, 2024

> Is it necessary to rebuild the network structure, converting the dict to tensors in every layer, so that it can receive pure tensor data?

You're right. You need to change the model's inputs and any other code that uses dictionaries so that the model's forward pass works on pure tensors.

@HUXING8 commented Jul 8, 2024

> > Is it necessary to rebuild the network structure, converting the dict to tensors in every layer, so that it can receive pure tensor data?
>
> You're right. You need to change the model's inputs and any other code that uses dictionaries so that the model's forward pass works on pure tensors.

Thanks for your reply; I will try to deal with it.
By the way, have you done this conversion yourself, and what were the results like? I am looking forward to hearing about your experience.

@SunHaoOne

> By the way, have you done this conversion yourself, and what were the results like? I am looking forward to hearing about your experience.

On my machine, the average inference time is approximately 10 ms per scenario.

@xiaowuge1201

How are the operators from torch_geometric converted to ONNX?

@SunHaoOne

> How are the operators from torch_geometric converted to ONNX?

PyG and ONNX don't work very well together, especially with operations like scatter_add. However, the author has shared excellent embedding code that you can use as a basis for rewriting.
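
A minimal sketch of one common substitution (whether it covers every PyG call in QCNet depends on the exact layers): a torch_scatter-style scatter_add can often be replaced by the built-in Tensor.scatter_add_, which the ONNX exporter can map to ScatterElements with reduction="add" on recent opsets (16+).

import torch

# src: per-edge messages, index: destination node of each edge
src = torch.randn(6, 4)                   # 6 edges, 4 features
index = torch.tensor([0, 0, 1, 2, 2, 2])  # destination node ids
num_nodes = 3

# Equivalent of torch_scatter.scatter_add(src, index, dim=0, dim_size=num_nodes)
out = torch.zeros(num_nodes, src.size(1))
out.scatter_add_(0, index.unsqueeze(-1).expand_as(src), src)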

@yuanryann

Hi @SunHaoOne, have you produced the ONNX model? Could you please share some ideas on the TensorRT process?
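
For reference, once a valid ONNX file exists, the usual path to a TensorRT engine is either the trtexec tool (trtexec --onnx=model.onnx --saveEngine=model.plan) or the Python builder API. A minimal sketch, assuming TensorRT 8.x and a fixed-shape model.onnx (file names are placeholders):

import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
)
parser = trt.OnnxParser(network, logger)

# Parse the exported ONNX model; print parser errors if it fails.
with open("model.onnx", "rb") as f:
    if not parser.parse(f.read()):
        for i in range(parser.num_errors):
            print(parser.get_error(i))
        raise RuntimeError("ONNX parse failed")

config = builder.create_builder_config()
config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, 1 << 30)  # 1 GiB

# Build and save the serialized engine.
engine_bytes = builder.build_serialized_network(network, config)
with open("model.plan", "wb") as f:
    f.write(engine_bytes)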

@xiaowuge1201

> > How are the operators from torch_geometric converted to ONNX?
>
> PyG and ONNX don't work very well together, especially with operations like scatter_add. However, the author has shared excellent embedding code that you can use as a basis for rewriting.

I encountered some issues while rewriting this code:
The input to a graph neural network is graph-structured, i.e. sparse, data. When I replace the PyG operations, I can use dense computation instead. However, dense computation increases the amount of data and reduces computational efficiency, so it is not a real solution.
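
To make the tradeoff concrete: scatter-based aggregation does O(E) work, while the dense replacement builds an N x N adjacency matrix, so memory and compute grow with N^2 even on sparse graphs. A minimal sketch (toy graph, not QCNet code):

import torch

num_nodes, feat = 3, 4
edge_index = torch.tensor([[0, 1, 2, 2],   # source nodes
                           [1, 0, 0, 1]])  # destination nodes
x = torch.randn(num_nodes, feat)

# Sparse: O(E) work, what PyG's scatter-based message passing does.
out_sparse = torch.zeros(num_nodes, feat)
out_sparse.scatter_add_(
    0, edge_index[1].unsqueeze(-1).expand(-1, feat), x[edge_index[0]]
)

# Dense: same result via an N x N adjacency matmul. ONNX-friendly,
# but memory and compute now scale with num_nodes ** 2.
adj = torch.zeros(num_nodes, num_nodes)
adj[edge_index[1], edge_index[0]] = 1.0
out_dense = adj @ x

assert torch.allclose(out_sparse, out_dense, atol=1e-6)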

@Crown798 commented Aug 2, 2024

@xiaowuge1201 Hi, have you found a solution? I hit the same problem with dense computation: inference is painfully slow with the ONNX CPU version, and my GPU memory is not enough to run the ONNX GPU version.
