Support for ISO-2022-JP in Multipart Requests #2245

Open
nappa opened this issue Sep 2, 2024 · 2 comments

Comments

@nappa

nappa commented Sep 2, 2024

There is an issue where an Encoding::CompatibilityError is raised when the request body is multipart and the Content-Type is a text format (e.g., text/html or text/plain) with the charset set to ISO-2022-JP. ISO-2022-JP is currently treated as a dummy encoding in Ruby, meaning Ruby does not implement character handling for it and cannot process such strings character by character.

Other dummy encodings include UTF-7 and IBM037, which are rarely used today. However, ISO-2022-JP, though considered a legacy encoding, is still used in emails in Japan due to historical reasons. As a result, some services that relay emails over HTTP may still use ISO-2022-JP.
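As a quick check of the dummy-encoding status described above, here is a minimal sketch that asks Ruby which encodings it marks as dummy. It uses only `Encoding.list`, `Encoding#dummy?`, and `Encoding#ascii_compatible?`; the variable names are illustrative, and the exact list printed may differ between Ruby versions.

```ruby
# Names of every encoding this Ruby build marks as dummy.
# The contents of this list can vary by Ruby version.
dummy_names = Encoding.list.select(&:dummy?).map(&:name)
puts dummy_names.inspect

# Compare ISO-2022-JP with UTF-8.
iso2022jp_is_dummy = Encoding::ISO_2022_JP.dummy?
utf8_is_dummy = Encoding::UTF_8.dummy?

puts "ISO-2022-JP dummy? #{iso2022jp_is_dummy}"
puts "UTF-8 dummy? #{utf8_is_dummy}"

# Dummy encodings are also reported as not ASCII-compatible.
puts "ISO-2022-JP ASCII-compatible? #{Encoding::ISO_2022_JP.ascii_compatible?}"
```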

For example, SendGrid, a service that receives emails via an SMTP Gateway, represents these emails as multipart data and sends them to webhooks via HTTP POST. In this process, the email body, encoded in ISO-2022-JP, is sent as a part of the request in its original encoding.

In our environment, we use Rack to handle requests from SendGrid, but we are encountering an issue where these requests cannot be processed due to the ISO-2022-JP encoding.

@matthewd
Contributor

matthewd commented Sep 4, 2024

For comparison / context, which place(s) raise the Encoding::CompatibilityError?

Re-encoding seems like it might be reasonable, but the other option might be to fix any Rack internals to allow dummy-encoded values to get to the application, and allow the app to decide what to do. 🤔

@nappa
Author

nappa commented Sep 6, 2024

To provide some context, the Encoding::CompatibilityError we're encountering occurs at the following location in our environment:

name =~ %r(\A[\[\]]*([^\[\]]+)\]*)

(We are using rack-2.2 in our production environment, so this is the location where the error occurs. Although query_parser.rb was significantly modified in rack-3.0, the same issue persists there.)
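For reference, the error can be reproduced outside Rack with a minimal sketch like the following. It assumes the parameter name ends up tagged as ISO-2022-JP before it reaches the regular expression; `NAME_PATTERN` and `match_error_for` are hypothetical names used only for this illustration, not Rack identifiers.

```ruby
# The same pattern as in the line quoted above, stored in a constant.
NAME_PATTERN = %r(\A[\[\]]*([^\[\]]+)\]*)

# Returns the Encoding::CompatibilityError raised by the match,
# or nil if the match ran without an encoding error.
def match_error_for(name)
  name =~ NAME_PATTERN
  nil
rescue Encoding::CompatibilityError => e
  e
end

utf8_name = "text"
# Assumption: the name has been tagged with the part's charset.
iso_name = "text".dup.force_encoding(Encoding::ISO_2022_JP)

utf8_error = match_error_for(utf8_name)
iso_error = match_error_for(iso_name)

puts "UTF-8 name: #{utf8_error.inspect}"
puts "ISO-2022-JP name: #{iso_error.class}: #{iso_error&.message}"
```

Both strings contain the same ASCII bytes; only the encoding tag differs, which is enough for the match to be rejected.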

I had a similar thought about passing dummy-encoded values directly to the application, but unfortunately, it wasn't feasible. Here's why:

Since ISO-2022-JP is a dummy encoding in Ruby, it doesn't support character-level splitting. To handle it within Rack, we'd need to either re-encode the data to something like UTF-8 (this is my approach) or implement custom handling to parse through ISO-2022-JP's state-transition-based encoding to detect multipart boundary markers. The latter approach would require not only changes in query_parser.rb but likely many other parts of the codebase as well, due to its complexity.
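To illustrate the re-encoding approach, here is a minimal sketch using `String#encode`. It is not the actual patch; the sample bytes and variable names are chosen for this example. It also shows why scanning the raw bytes is unreliable: inside an escape-shifted section, bytes that look like ASCII letters are really halves of Japanese characters.

```ruby
# "こんにちは" in ISO-2022-JP: ESC $ B switches to JIS X 0208,
# ESC ( B switches back to ASCII.
raw = "\e$B$3$s$K$A$O\e(B".b

# A byte-level search finds "K", although the text contains no letter K:
# byte 0x4B is the second byte of the character に.
raw_has_ascii_k = raw.include?("K")

# Tag the bytes with their real encoding, then transcode to UTF-8.
body = raw.dup.force_encoding(Encoding::ISO_2022_JP)
utf8_body = body.encode(Encoding::UTF_8)

# After transcoding, ordinary string operations work as expected.
utf8_has_k = utf8_body.include?("K")

puts utf8_body                          # prints こんにちは
puts "raw bytes include K? #{raw_has_ascii_k}"
puts "UTF-8 text includes K? #{utf8_has_k}"
```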

While I agree that allowing the application to handle ISO-2022-JP would be ideal, the complexity of the code required to achieve this doesn't seem to justify the benefits.
