fix: Properly set ARN in namespace for Iceberg Glue symlinks #2943
Problem
For Iceberg tables configured in a Glue catalog, the symlinks have paths instead of ARNs in the `namespace` field. For example:
What we want instead is:
The problem is in the `io.openlineage.spark3.agent.lifecycle.plan.catalog.IcebergHandler#getDatasetIdentifier` method, which doesn't support the `glue` catalog type.
Solution
The `IcebergHandler` should support `glue` catalog tables and create the symlink using the code from `PathUtils`.
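To make the intended result concrete, here is a minimal sketch of building an ARN-based symlink identifier for a Glue table. It assumes the symlink uses the standard AWS Glue ARN layout, with `arn:aws:glue:{region}:{account id}` as the namespace and `table/{database}/{table}` as the name. The class `GlueSymlinkSketch`, its `Symlink` holder, and the `glueSymlink` method are hypothetical names for illustration; they are not the actual `IcebergHandler` or `PathUtils` code, and the real change may obtain the region and account ID differently.

```java
import java.util.Objects;

/** Hypothetical sketch; not the actual OpenLineage IcebergHandler or PathUtils code. */
final class GlueSymlinkSketch {

  /** Simple holder for the two parts of a symlink identifier. */
  static final class Symlink {
    final String namespace;
    final String name;

    Symlink(String namespace, String name) {
      this.namespace = Objects.requireNonNull(namespace, "namespace");
      this.name = Objects.requireNonNull(name, "name");
    }
  }

  /**
   * Builds an ARN-based symlink for a Glue table.
   * Assumption: namespace is arn:aws:glue:{region}:{account id},
   * name is table/{database}/{table}.
   */
  static Symlink glueSymlink(String region, String accountId, String database, String table) {
    String namespace = "arn:aws:glue:" + region + ":" + accountId;
    String name = "table/" + database + "/" + table;
    return new Symlink(namespace, name);
  }

  public static void main(String[] args) {
    // Example values are made up for illustration.
    Symlink symlink = glueSymlink("us-east-1", "123456789012", "sales", "orders");
    System.out.println("namespace = " + symlink.namespace);
    System.out.println("name      = " + symlink.name);
  }
}
```

With the made-up inputs in `main`, the namespace is an ARN prefix rather than a warehouse path, which is the behaviour this PR asks for when the catalog type is `glue`.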
Checklist
SPDX-License-Identifier: Apache-2.0
Copyright 2018-2024 contributors to the OpenLineage project