Create guide for Machine Learning Engine operators #8207 #8968
Closed: U-Ozdemir wants to merge 155 commits into dependabot/npm_and_yarn/airflow/www/jquery-3.5.0 from master
…8625) PostgresHook's parent class, DbApiHook, implements upsert in its insert_rows() method with the replace=True flag. However, the generated SQL uses MySQL's "REPLACE INTO" syntax, which is not valid in PostgreSQL. This pulls the SQL generation code for insert/upsert out into a method that is then overridden in the PostgreSQL subclass to generate the "INSERT ... ON CONFLICT DO UPDATE" syntax ("new" since Postgres 9.5)
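A minimal sketch of the kind of SQL such an override could produce. The function name `generate_upsert_sql` and its parameters are hypothetical and are not taken from the commit; identifiers are interpolated without quoting, so this is illustrative only.

```python
from typing import Sequence


def generate_upsert_sql(
    table: str, target_fields: Sequence[str], replace_index: Sequence[str]
) -> str:
    """Build an "INSERT ... ON CONFLICT DO UPDATE" statement with %s placeholders.

    Hypothetical helper: names and signature are assumptions for illustration.
    Identifiers are NOT escaped here; real code must quote them safely.
    """
    if not target_fields:
        raise ValueError("target_fields is required to build an upsert")
    if not replace_index:
        raise ValueError("replace_index (the conflict columns) is required")

    columns = ", ".join(target_fields)
    placeholders = ", ".join(["%s"] * len(target_fields))
    conflict_columns = ", ".join(replace_index)

    sql = (
        f"INSERT INTO {table} ({columns}) VALUES ({placeholders}) "
        f"ON CONFLICT ({conflict_columns})"
    )

    # Only non-key columns are updated when a conflicting row already exists.
    updates = [field for field in target_fields if field not in replace_index]
    if updates:
        assignments = ", ".join(f"{field} = excluded.{field}" for field in updates)
        sql += f" DO UPDATE SET {assignments}"
    else:
        sql += " DO NOTHING"
    return sql
```

Unlike MySQL's "REPLACE INTO", the PostgreSQL form needs to know which columns define the conflict, which is why the sketch takes a separate `replace_index` argument.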
Right now requirements will be only checked during the CI build if the setup.py has changed and if yes, clear instructions will be given. The diff will still be printed otherwise but it will not cause the job to fail
Their response format is like {"example_dag_id": [{"state": "success", "dag_id": "example_dag_id"}, ...], ...} The dag_id is already used as the key, but it is still repeated in each element, which makes the response payload unnecessarily large
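The redundancy described above can be sketched as follows. The helper name `strip_redundant_dag_id` is hypothetical, and the sketch assumes the fix is simply to drop the repeated field; the commit may implement it differently.

```python
from typing import Any, Dict, List

Response = Dict[str, List[Dict[str, Any]]]


def strip_redundant_dag_id(response: Response) -> Response:
    """Return a copy of the response without the per-element "dag_id" field.

    The outer key already identifies the DAG, so the inner copy adds bytes
    but no information. Hypothetical helper for illustration only.
    """
    return {
        dag_id: [
            {key: value for key, value in item.items() if key != "dag_id"}
            for item in items
        ]
        for dag_id, items in response.items()
    }


verbose = {
    "example_dag_id": [
        {"state": "success", "dag_id": "example_dag_id"},
        {"state": "failed", "dag_id": "example_dag_id"},
    ]
}
compact = strip_redundant_dag_id(verbose)
```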
Allow EmrCreateJobFlowOperator and EmrAddStepsOperator to receive their 'job_flow_overrides', and 'steps' arguments respectively as Jinja template filenames. This is similar to BashOperator's capability of receiving a filename as its 'bash_command' argument.
Changes deprecated config check rules. Now uses regex to look for an old pattern in the val. Updates 'hostname_callable'. This lets us pull the change back in to 1.10.x, so that by the time 2.0 is around people will have had time and notice to update, without reading (the now quite long) UPDATING.md. Depends on #8463
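A rough sketch of a regex-based deprecated-value check. The rule table, the function name `upgrade_value`, and the specific colon-to-dot rewrite for 'hostname_callable' are assumptions made for illustration; the commit message only says that a regex is used to find an old pattern in the value.

```python
import re
from typing import Dict, Pattern, Tuple

# Hypothetical rule table: (section, key) -> (old pattern, replacement).
# The colon-to-dot rule is an assumed example, not quoted from the commit.
DEPRECATED_VALUES: Dict[Tuple[str, str], Tuple[Pattern[str], str]] = {
    ("core", "hostname_callable"): (re.compile(r":"), "."),
}


def upgrade_value(section: str, key: str, value: str) -> Tuple[str, bool]:
    """Return (possibly rewritten value, whether a deprecated pattern matched)."""
    rule = DEPRECATED_VALUES.get((section, key))
    if rule is None:
        return value, False
    pattern, replacement = rule
    if pattern.search(value) is None:
        return value, False
    # A real implementation would also emit a deprecation warning here.
    return pattern.sub(replacement, value), True
```

Matching on the value rather than on the option name lets old-style settings keep working while still being flagged, so users get notice before the old form is removed.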
When using KubernetesExecutor without any centralized PV for log storage, one has to wait until the logs get uploaded to cloud storage before viewing them on UI. With this change, the webserver will try to fetch logs from running worker pods and display them.
Connection add/edit UI pages were not working correctly for Spark connections. The root cause is that "spark" is not listed in models.Connection._types. So when www/forms.py tries to produce the UI, "spark" is not available and it always falls back to the option list whose first entry is "Docker". In addition, we should hide irrelevant entries for Spark connections ("schema", "login", and "password")
Currently the connection type list in the UI is sorted in the original order of `Connection._types`, which may be a bit inconvenient for users. It would be better if it can be sorted alphabetically.
* Remove config side effects
* Fix LatestOnlyOperator return type to be json serializable
* Fix tests/test_configuration.py
* Fix tests/executors/test_dask_executor.py
* Fix tests/jobs/test_scheduler_job.py
* Fix tests/models/test_cleartasks.py
* Fix tests/models/test_taskinstance.py
* Fix tests/models/test_xcom.py
* Fix tests/security/test_kerberos.py
* Fix tests/test_configuration.py
* Fix tests/test_logging_config.py
* Fix tests/utils/test_dag_processing.py
* Apply isort
* Fix tests/utils/test_email.py
* Fix tests/utils/test_task_handler_with_custom_formatter.py
* Fix tests/www/api/experimental/test_kerberos_endpoints.py
* Fix tests/www/test_views.py
* Code refactor
* Fix tests/www/api/experimental/test_kerberos_endpoints.py
* Fix requirements
* fixup! Fix tests/www/test_views.py
Co-authored-by: Ace Haidrey <[email protected]>
Co-authored-by: James Timmins <[email protected]>
Co-authored-by: michalslowikowski00 <[email protected]>
…8910) Currently there is no way for people with colour blindness to determine the state of a TaskInstance in the graph view or tree view. Approximately 4.5% of people experience some form of colour vision deficiency
The singularity operator tests _have always_ used mocking, so we were adding 700MB to our docker image for nothing. Fixes #8774
CSRF_ENABLED does nothing. Thankfully, due to sensible defaults in flask-wtf, CSRF is on by default, but we should set this correctly. Fixes #8915
All PRs will use the cached "latest good" version of the Python base images from our GitHub registry. The Python versions in the GitHub registry will only get updated after a master build (which pulls the latest Python image from DockerHub) builds and passes tests correctly. This is to avoid problems that we had recently with Python patchlevel releases breaking our Docker builds.
Slight "improvement" on #8949
…c dag (#8952) The scheduler_dag_execution_timing script wants to run _n_ dag runs to completion. However, since the start date of those dags is dynamic (`now - delta`) we can't pre-compute the execution_dates like we were before. (This is because the execution_date of the very first dag run would be `now()` of the parser process, but if we try to pre-compute that in the benchmark process it would see a different value of now().) This PR changes it to instead watch for the first _n_ dag runs to be completed. This should make it work with more dags with fewer changes to them.
* Push CI images to Docker package cache for v1-10 branches
  This is done as a commit to master so that we can keep the two branches in sync
  Co-Authored-By: Ash Berlin-Taylor <[email protected]>
* Run Github Actions against v1-10-stable too
Co-authored-by: Ash Berlin-Taylor <[email protected]>
Old Repo: https://github.com/Azure/azure-cosmos-python
New Repo: https://github.com/Azure/azure-sdk-for-python/tree/master/sdk/cosmos/azure-cosmos
azure-cosmos==4.0.0, released on 20 May 2020, breaks Airflow
`field_path` was renamed to `tag_template_field_path` in >=0.8 and there might be other unknown errors
* [AIRFLOW-5262] Update timeout exception to include dag
* PR comment: extract dag id in log to variable
boring-cyborg bot added the area:CLI, area:dev-tools, and area:Scheduler (including HA (high availability) scheduler) labels on May 22, 2020
Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about anything, please check our Contribution Guide (https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst)
Co-authored-by: Ace Haidrey <[email protected]>
Fixes AIRFLOW-6569 by explicitly flushing pending exceptions prior to calling `os._exit` within the forked task runner.
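The reason an explicit flush matters is that `os._exit` terminates the process immediately, without running cleanup handlers or flushing buffers. The sketch below illustrates the hazard using buffered stdout as a stand-in; that substitution is an assumption for illustration, since the commit is about pending exceptions in the forked task runner, not stdout. The names `CHILD_TEMPLATE` and `run_child` are hypothetical.

```python
import subprocess
import sys

# Child program: write to (pipe-buffered) stdout, optionally flush, then
# exit via os._exit, which skips normal interpreter shutdown.
CHILD_TEMPLATE = """
import os, sys
sys.stdout.write("pending message")
if {flush}:
    sys.stdout.flush()
os._exit(0)
"""


def run_child(flush: bool) -> str:
    """Run the child in a subprocess and return whatever reached its stdout."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD_TEMPLATE.format(flush=flush)],
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return result.stdout


# With the explicit flush the message survives the hard exit.
flushed_output = run_child(flush=True)
# Without it, the buffered message is typically lost (not asserted here,
# because buffering can be changed by the environment).
unflushed_output = run_child(flush=False)
```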
Could you fix source & target branch?
@U-Ozdemir Can you create a new PR against master please
This is my first time working with an open source project, and I am posting here my first attempt at an ML operator guide. It is still in progress, but feedback is always welcome.
Guide for ML operator (Work in progress)
AI Platform:
The AI Platform is used to train your machine learning models at scale, to host your trained model in the cloud, and to use your model to make predictions about new data.
Machine learning (ML) is a subfield of artificial intelligence (AI). The goal of ML is to make computers learn from the data that you give them. Instead of writing code that describes the action the computer should take, your code provides an algorithm that adapts based on examples of intended behavior. The resulting program, consisting of the algorithm and associated learned parameters, is called a trained model.
Prerequisite Tasks
To use these operators, you must do a few things:
Detailed information is available in Installation.
Operators
MLEngineManageModelOperator
MLEngineCreateVersionOperator
MLEngineDeleteVersionOperator
MLEngineDeleteModelOperator
Make sure to mark the boxes below before creating PR: [x]
In case of fundamental code change, Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in UPDATING.md.
Read the Pull Request Guidelines for more information.