-
Notifications
You must be signed in to change notification settings - Fork 315
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
test: Add integration tests for EMR #3142
test: Add integration tests for EMR #3142
Conversation
95faa83
to
0ccd0a0
Compare
0ccd0a0
to
b567a55
Compare
@arturowczarek can we run "regular" integration tests on EMR, or only special tests otherwise located in the |
@mobuchowski The loading class locates the files in the specific directory for simplicity. We can extend it if necessary, but passing just the name of the file is convenient, because the name is used to create temporary directory where everything for this test is kept. Otherwise we could have multiple scripts with the same name and they would keep the events under the same location. Alternatively we could have a better strategy for determining the location for the events. |
Makes sense, I just think we'd also potentially benefit with running a bunch of "regular" integration tests on EMR regularly. |
Yes, it is one of the methods we could further develop these tests. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is huge step forward in improving OL event collection on AWS.
The code looks good to me.
What system properties are required to be set to run the test?
The parameters are listed in |
b567a55
to
0c59d58
Compare
* Spark integration has integration tests for EMR Signed-off-by: Artur Owczarek <[email protected]>
0c59d58
to
bc7befc
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great test. Thank you @arturowczarek for live demo.
Problem
We are missing the integration tests for EMR. Without them some types of bugs remain undetected and we may introduce regressions without knowing it.
Solution
We should have integration tests. We don't have a sponsor for the AWS infrastructure, so every developer should be able to run them manually using their infrastructure.
For reviewers: Start reading from
EmrIntegrationTest
class.If you're contributing a new integration, please specify the scope of the integration and how/where it has been tested (e.g., Apache Spark integration supports
S3
andGCS
filesystem operations, tested with AWS EMR).One-line summary:
Checklist
SPDX-License-Identifier: Apache-2.0
Copyright 2018-2024 contributors to the OpenLineage project