
Configuration variable empty #86

Open
gfelot opened this issue Aug 17, 2020 · 5 comments

@gfelot

gfelot commented Aug 17, 2020

I followed the README, created the Variables, and adjusted a few things in the code.

At the end I got:

[2020-08-17 15:56:56,257] {bash_operator.py:126} INFO - Configurations:
[2020-08-17 15:56:56,257] {bash_operator.py:126} INFO - BASE_LOG_FOLDER:      ''
[2020-08-17 15:56:56,257] {bash_operator.py:126} INFO - MAX_LOG_AGE_IN_DAYS:  ''
[2020-08-17 15:56:56,257] {bash_operator.py:126} INFO - ENABLE_DELETE:        ''

Whereas I expected:

[2020-08-17 15:56:56,257] {bash_operator.py:126} INFO - Configurations:
[2020-08-17 15:56:56,257] {bash_operator.py:126} INFO - BASE_LOG_FOLDER:      '/data/airflow/logs'
[2020-08-17 15:56:56,257] {bash_operator.py:126} INFO - MAX_LOG_AGE_IN_DAYS:  '30'
[2020-08-17 15:56:56,257] {bash_operator.py:126} INFO - ENABLE_DELETE:        'true'

Also, the script doesn't run, since I get this "error":

INFO - Another task is already deleting logs on this worker node.     Skipping it!
INFO - If you believe you're receiving this message in error, kindly check     if /tmp/airflow_log_cleanup_worker.lock exists and delete it.
INFO - Command exited with return code 0
INFO - Marking task as SUCCESS.dag_id=clean_airflow_logs, task_id=log_cleanup_worker_num_1_dir_0, execution_date=20200817T130009, start_date=20200817T130027, end_date=20200817T130027

Also, in this case the script should exit 1 since there was an issue, but here my DAG and task say that everything ran fine.
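
For context on where those values come from: the cleanup DAG follows the usual pattern of reading its settings from Airflow Variables (falling back to defaults) and substituting them into a templated bash script, so empty lookups surface as the '' strings in the log above. The sketch below only illustrates that pattern; the Variable keys and the bash snippet are hypothetical placeholders, not the actual airflow-log-cleanup.py code.

# Illustrative sketch only: the Variable keys and the bash snippet below are
# hypothetical placeholders, not the actual airflow-log-cleanup.py code.
from datetime import datetime

from airflow import DAG
from airflow.models import Variable
from airflow.operators.bash_operator import BashOperator  # Airflow 1.10.x import path

# Settings are read from Airflow Variables, with defaults when a Variable is unset.
BASE_LOG_FOLDER = Variable.get("airflow_log_cleanup__base_log_folder",
                               default_var="/data/airflow/logs")
MAX_LOG_AGE_IN_DAYS = Variable.get("airflow_log_cleanup__max_log_age_in_days",
                                   default_var="30")
ENABLE_DELETE = Variable.get("airflow_log_cleanup__enable_delete",
                             default_var="true")

dag = DAG("clean_airflow_logs_example",
          start_date=datetime(2020, 8, 1),
          schedule_interval="@daily")

log_cleanup_worker = BashOperator(
    task_id="log_cleanup_worker_num_1_dir_0",
    # The values are injected via Jinja params; if they are empty strings here,
    # the rendered command echoes '' exactly as in the log above.
    bash_command=(
        "echo \"BASE_LOG_FOLDER:      '{{ params.base_log_folder }}'\"\n"
        "echo \"MAX_LOG_AGE_IN_DAYS:  '{{ params.max_log_age_in_days }}'\"\n"
        "echo \"ENABLE_DELETE:        '{{ params.enable_delete }}'\"\n"
    ),
    params={
        "base_log_folder": BASE_LOG_FOLDER,
        "max_log_age_in_days": MAX_LOG_AGE_IN_DAYS,
        "enable_delete": ENABLE_DELETE,
    },
    dag=dag,
)
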

@prakshalj0512
Contributor

What version of Airflow are you using?

@gfelot
Author

gfelot commented Apr 8, 2021

Hi @prakshalj0512, sorry for the delay; fixing this wasn't at the top of my list lately. Now it is...

I opened different issues, but they all turn out to be related to this one: every variable is empty.

If I echo ENABLE_DELETE I get nothing (even with the Variable set in Airflow AND hard-coded like in the original DAG), and it's exactly the same for every variable.
Also, the return code from the successful creation of the lock file (or any other return value) prints nothing, so every conditional jumps to the else branch even though the lock file was created.

I removed that if/else part and it then failed at the removal step, even though the file was removed correctly. So there is an issue in the script.

My Airflow version is 1.10.12.
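
One quick check on the worker itself is whether the Variables the DAG reads actually resolve to non-empty values there. A minimal sketch (the keys below are placeholders; use the exact keys from the README):

# Sketch (Airflow 1.10.x): confirm the Variables resolve on this worker.
# Needs access to the Airflow metadata DB; the keys below are placeholders.
from airflow.models import Variable

for key in ("airflow_log_cleanup__max_log_age_in_days",
            "airflow_log_cleanup__enable_delete_child_log"):
    value = Variable.get(key, default_var=None)
    print("{!r} -> {!r}".format(key, value))  # None means the Variable is not set
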

@prakshalj0512
Contributor

Hi,
Could you try downloading the latest version of airflow-log-cleanup.py? I just tried running it locally and it seems to be working fine.
Below are the logs from the run. I haven't made any changes to the file. It's pulling the BASE_LOG_FOLDER location from airflow conf. Also, before starting, make sure to delete /tmp/airflow_log_cleanup_worker.lock if it exists.

[2021-04-08 15:46:16,184] {bash_operator.py:153} INFO - Output:
[2021-04-08 15:46:16,189] {bash_operator.py:157} INFO - Getting Configurations...
[2021-04-08 15:46:19,199] {bash_operator.py:157} INFO - maxLogAgeInDays conf variable isn't included. Using Default '30'.
[2021-04-08 15:46:19,200] {bash_operator.py:157} INFO - Finished Getting Configurations
[2021-04-08 15:46:19,200] {bash_operator.py:157} INFO - 
[2021-04-08 15:46:19,200] {bash_operator.py:157} INFO - Configurations:
[2021-04-08 15:46:19,200] {bash_operator.py:157} INFO - BASE_LOG_FOLDER:      '/Users/prakshaljain/airflow/logs'
[2021-04-08 15:46:19,200] {bash_operator.py:157} INFO - MAX_LOG_AGE_IN_DAYS:  '30'
[2021-04-08 15:46:19,201] {bash_operator.py:157} INFO - ENABLE_DELETE:        'true'
[2021-04-08 15:46:19,201] {bash_operator.py:157} INFO - Lock file not found on this node!     Creating it to prevent collisions...
[2021-04-08 15:46:19,210] {bash_operator.py:157} INFO - 
[2021-04-08 15:46:19,210] {bash_operator.py:157} INFO - Running Cleanup Process...
[2021-04-08 15:46:19,210] {bash_operator.py:157} INFO - Executing Find Statement: find /Users/prakshaljain/airflow/logs/*/* -type f -mtime       30
[2021-04-08 15:46:19,240] {bash_operator.py:157} INFO - Process will be Deleting the following File(s)/Directory(s):
[2021-04-08 15:46:19,241] {bash_operator.py:157} INFO - 
[2021-04-08 15:46:19,251] {bash_operator.py:157} INFO - Process will be Deleting        0 File(s)/Directory(s)
[2021-04-08 15:46:19,251] {bash_operator.py:157} INFO - 
[2021-04-08 15:46:19,252] {bash_operator.py:157} INFO - WARN: No File(s)/Directory(s) to Delete
[2021-04-08 15:46:19,252] {bash_operator.py:157} INFO - Executing Find Statement: find /Users/prakshaljain/airflow/logs/*/* -type d -empty
[2021-04-08 15:46:19,259] {bash_operator.py:157} INFO - Process will be Deleting the following File(s)/Directory(s):
[2021-04-08 15:46:19,259] {bash_operator.py:157} INFO - /Users/prakshaljain/airflow/logs/scheduler/2021-02-22
[2021-04-08 15:46:19,259] {bash_operator.py:157} INFO - /Users/prakshaljain/airflow/logs/scheduler/2021-02-23
[2021-04-08 15:46:19,260] {bash_operator.py:157} INFO - /Users/prakshaljain/airflow/logs/scheduler/2021-02-24
[2021-04-08 15:46:19,260] {bash_operator.py:157} INFO - /Users/prakshaljain/airflow/logs/scheduler/2021-02-25
[2021-04-08 15:46:19,260] {bash_operator.py:157} INFO - /Users/prakshaljain/airflow/logs/scheduler/2021-02-26
[2021-04-08 15:46:19,261] {bash_operator.py:157} INFO - /Users/prakshaljain/airflow/logs/scheduler/2021-02-27
[2021-04-08 15:46:19,261] {bash_operator.py:157} INFO - /Users/prakshaljain/airflow/logs/scheduler/2021-02-28
[2021-04-08 15:46:19,261] {bash_operator.py:157} INFO - /Users/prakshaljain/airflow/logs/scheduler/2021-03-04
[2021-04-08 15:46:19,261] {bash_operator.py:157} INFO - /Users/prakshaljain/airflow/logs/scheduler/2021-03-05
[2021-04-08 15:46:19,262] {bash_operator.py:157} INFO - /Users/prakshaljain/airflow/logs/scheduler/2021-03-16
[2021-04-08 15:46:19,262] {bash_operator.py:157} INFO - /Users/prakshaljain/airflow/logs/scheduler/2021-03-17
[2021-04-08 15:46:19,262] {bash_operator.py:157} INFO - /Users/prakshaljain/airflow/logs/scheduler/2021-03-30
[2021-04-08 15:46:19,262] {bash_operator.py:157} INFO - /Users/prakshaljain/airflow/logs/scheduler/2021-04-06
[2021-04-08 15:46:19,263] {bash_operator.py:157} INFO - /Users/prakshaljain/airflow/logs/scheduler/2021-04-07
[2021-04-08 15:46:19,264] {bash_operator.py:157} INFO - Process will be Deleting       14 File(s)/Directory(s)
[2021-04-08 15:46:19,264] {bash_operator.py:157} INFO - 
[2021-04-08 15:46:19,264] {bash_operator.py:157} INFO - Executing Delete Statement: find /Users/prakshaljain/airflow/logs/*/* -type d -empty -prune -exec rm -rf {} \;
[2021-04-08 15:46:19,305] {bash_operator.py:157} INFO - Executing Find Statement: find /Users/prakshaljain/airflow/logs/* -type d -empty
[2021-04-08 15:46:19,310] {bash_operator.py:157} INFO - Process will be Deleting the following File(s)/Directory(s):
[2021-04-08 15:46:19,311] {bash_operator.py:157} INFO - 
[2021-04-08 15:46:19,314] {bash_operator.py:157} INFO - Process will be Deleting        0 File(s)/Directory(s)
[2021-04-08 15:46:19,315] {bash_operator.py:157} INFO - 
[2021-04-08 15:46:19,315] {bash_operator.py:157} INFO - WARN: No File(s)/Directory(s) to Delete
[2021-04-08 15:46:19,315] {bash_operator.py:157} INFO - Finished Running Cleanup Process
[2021-04-08 15:46:19,316] {bash_operator.py:157} INFO - Deleting lock file...
[2021-04-08 15:46:19,317] {bash_operator.py:159} INFO - Command exited with return code 0
[2021-04-08 15:46:19,325] {taskinstance.py:1057} INFO - Marking task as SUCCESS.dag_id=airflow_log_cleanup, task_id=log_cleanup_worker_num_1_dir_0, execution_date=20210407T000000, start_date=20210408T101616, end_date=20210408T101619
[2021-04-08 15:46:20,990] {local_task_job.py:102} INFO - Task exited with return code 0
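
For the BASE_LOG_FOLDER-from-airflow-conf behaviour mentioned above, the value the DAG would pick up can be checked directly on the box. A minimal sketch, assuming the Airflow 1.10.x configuration API:

# Minimal sketch (Airflow 1.10.x): read base_log_folder straight from airflow.cfg.
from airflow.configuration import conf

base_log_folder = conf.get("core", "base_log_folder")
print("BASE_LOG_FOLDER: {!r}".format(base_log_folder))
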

@gfelot
Author

gfelot commented Apr 8, 2021

I did that already.

I had removed the lock file beforehand.

If I print the value of CREATE_LOCK_FILE_EXIT_CODE=$? right after that line, I get nothing. No 0 or 1, nothing.

And the same is true for every variable I try to print.

*** Reading local file: /data/airflow/logs/clean_airflow_logs/log_cleanup_worker_num_1_dir_0/2021-04-07T00:00:00 00:00/22.log
[2021-04-08 11:20:27,696] {taskinstance.py:670} INFO - Dependencies all met for <TaskInstance: clean_airflow_logs.log_cleanup_worker_num_1_dir_0 2021-04-07T00:00:00 00:00 [queued]>
[2021-04-08 11:20:27,756] {taskinstance.py:670} INFO - Dependencies all met for <TaskInstance: clean_airflow_logs.log_cleanup_worker_num_1_dir_0 2021-04-07T00:00:00 00:00 [queued]>
[2021-04-08 11:20:27,756] {taskinstance.py:880} INFO - 
--------------------------------------------------------------------------------
[2021-04-08 11:20:27,756] {taskinstance.py:881} INFO - Starting attempt 22 of 23
[2021-04-08 11:20:27,757] {taskinstance.py:882} INFO - 
--------------------------------------------------------------------------------
[2021-04-08 11:20:27,777] {taskinstance.py:901} INFO - Executing <Task(BashOperator): log_cleanup_worker_num_1_dir_0> on 2021-04-07T00:00:00 00:00
[2021-04-08 11:20:27,782] {standard_task_runner.py:54} INFO - Started process 1234 to run task
[2021-04-08 11:20:27,837] {standard_task_runner.py:77} INFO - Running: ['airflow', 'run', 'clean_airflow_logs', 'log_cleanup_worker_num_1_dir_0', '2021-04-07T00:00:00 00:00', '--job_id', '26180', '--pool', 'default_pool', '--raw', '-sd', 'DAGS_FOLDER/clean_airflow_logs.py', '--cfg_path', '/tmp/tmpadn59i2f']
[2021-04-08 11:20:27,838] {standard_task_runner.py:78} INFO - Job 26180: Subtask log_cleanup_worker_num_1_dir_0
[2021-04-08 11:20:27,928] {logging_mixin.py:112} INFO - Running %s on host %s <TaskInstance: clean_airflow_logs.log_cleanup_worker_num_1_dir_0 2021-04-07T00:00:00 00:00 [running]> sea-cloud-airflow-applications.europe-west1-c.c.lisea-mesea-sandbox-272216.internal
[2021-04-08 11:20:27,997] {bash_operator.py:113} INFO - Tmp dir root location: 
 /tmp
[2021-04-08 11:20:28,007] {bash_operator.py:136} INFO - Temporary script location: /tmp/airflowtmpzu5t5bna/log_cleanup_worker_num_1_dir_0r1i1mn12
[2021-04-08 11:20:28,007] {bash_operator.py:146} INFO - Running command: 

echo "Getting Configurations..."
BASE_LOG_FOLDER="/data/airflow/logs"
WORKER_SLEEP_TIME="3"

sleep s

MAX_LOG_AGE_IN_DAYS=""
if [ "" == "" ]; then
    echo "maxLogAgeInDays conf variable isn't included. Using Default '30'."
    MAX_LOG_AGE_IN_DAYS='30'
fi
ENABLE_DELETE=true
echo "Finished Getting Configurations"
echo ""

echo "Configurations:"
echo "BASE_LOG_FOLDER:      ''"
echo "MAX_LOG_AGE_IN_DAYS:  ''"
echo "ENABLE_DELETE:        ''"

cleanup() {
    echo "Executing Find Statement: $1"
    FILES_MARKED_FOR_DELETE=`eval $1`
    echo "Process will be Deleting the following File(s)/Directory(s):"
    echo ""
    echo "Process will be Deleting `echo "" |     grep -v '^$' | wc -l` File(s)/Directory(s)"         # "grep -v '^$'" - removes empty lines.
    # "wc -l" - Counts the number of lines
    echo ""
    if [ "" == "true" ];
    then
        if [ "" != "" ];
        then
            echo "Executing Delete Statement: $2"
            eval $2
            DELETE_STMT_EXIT_CODE=$?
            if [ "" != "0" ]; then
                echo "Delete process failed with exit code                     ''"

                echo "Removing lock file..."
                rm -f /tmp/airflow_log_cleanup_worker.lock
                if [ "" != "0" ]; then
                    echo "Error removing the lock file.                     Check file permissions.                    To re-run the DAG, ensure that the lock file has been                     deleted (/tmp/airflow_log_cleanup_worker.lock)."
                    exit 
                fi
                exit 
            fi
        else
            echo "WARN: No File(s)/Directory(s) to Delete"
        fi
    else
        echo "WARN: You're opted to skip deleting the File(s)/Directory(s)!!!"
    fi
}


if [ ! -f /tmp/airflow_log_cleanup_worker.lock ]; then

    echo "Lock file not found on this node!     Creating it to prevent collisions..."
    touch /tmp/airflow_log_cleanup_worker.lock
    CREATE_LOCK_FILE_EXIT_CODE=$?
    if [ "" != "0" ]; then
        echo "Error creating the lock file.         Check if the airflow user can create files under tmp directory.         Exiting..."
        exit 
    fi

    echo ""
    echo "Running Cleanup Process..."

    FIND_STATEMENT="find /*/* -type f -mtime       "
    DELETE_STMT=" -exec rm -f {} \;"

    cleanup "" ""
    CLEANUP_EXIT_CODE=$?

    FIND_STATEMENT="find /*/* -type d -empty"
    DELETE_STMT=" -prune -exec rm -rf {} \;"

    cleanup "" ""
    CLEANUP_EXIT_CODE=$?

    FIND_STATEMENT="find /* -type d -empty"
    DELETE_STMT=" -prune -exec rm -rf {} \;"

    cleanup "" ""
    CLEANUP_EXIT_CODE=$?

    echo "Finished Running Cleanup Process"

    echo "Deleting lock file..."
    rm -f /tmp/airflow_log_cleanup_worker.lock
    REMOVE_LOCK_FILE_EXIT_CODE=$?
    if [ "" != "0" ]; then
        echo "Error removing the lock file. Check file permissions. To re-run the DAG, ensure that the lock file has been deleted (/tmp/airflow_log_cleanup_worker.lock)."
        exit 
    fi

else
    echo "Another task is already deleting logs on this worker node.     Skipping it!"
    echo "If you believe you're receiving this message in error, kindly check     if /tmp/airflow_log_cleanup_worker.lock exists and delete it."
    exit 0
fi

[2021-04-08 11:20:28,030] {bash_operator.py:153} INFO - Output:
[2021-04-08 11:20:28,032] {bash_operator.py:157} INFO - Getting Configurations...
[2021-04-08 11:20:28,048] {bash_operator.py:157} INFO - sleep: invalid time interval ‘s’
[2021-04-08 11:20:28,048] {bash_operator.py:157} INFO - Try 'sleep --help' for more information.
[2021-04-08 11:20:28,048] {bash_operator.py:157} INFO - maxLogAgeInDays conf variable isn't included. Using Default '30'.
[2021-04-08 11:20:28,048] {bash_operator.py:157} INFO - Finished Getting Configurations
[2021-04-08 11:20:28,048] {bash_operator.py:157} INFO - 
[2021-04-08 11:20:28,048] {bash_operator.py:157} INFO - Configurations:
[2021-04-08 11:20:28,048] {bash_operator.py:157} INFO - BASE_LOG_FOLDER:      ''
[2021-04-08 11:20:28,048] {bash_operator.py:157} INFO - MAX_LOG_AGE_IN_DAYS:  ''
[2021-04-08 11:20:28,048] {bash_operator.py:157} INFO - ENABLE_DELETE:        ''
[2021-04-08 11:20:28,049] {bash_operator.py:157} INFO - Lock file not found on this node!     Creating it to prevent collisions...
[2021-04-08 11:20:28,068] {bash_operator.py:157} INFO - Error creating the lock file.         Check if the airflow user can create files under tmp directory.         Exiting...
[2021-04-08 11:20:28,068] {bash_operator.py:161} INFO - Command exited with return code 0
[2021-04-08 11:20:28,099] {taskinstance.py:1070} INFO - Marking task as SUCCESS.dag_id=clean_airflow_logs, task_id=log_cleanup_worker_num_1_dir_0, execution_date=20210407T000000, start_date=20210408T092027, end_date=20210408T092028
[2021-04-08 11:20:32,641] {local_task_job.py:102} INFO - Task exited with return code 0
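
As an aside on the CREATE_LOCK_FILE_EXIT_CODE=$? capture described above, the touch-and-capture sequence can be tried in isolation, outside Airflow, to confirm the shell side behaves normally on the node. A minimal sketch:

# Sketch: reproduce the touch + exit-code capture outside of Airflow.
import subprocess

snippet = 'touch /tmp/airflow_log_cleanup_worker.lock; echo "exit code: $?"'
result = subprocess.run(snippet, shell=True,
                        stdout=subprocess.PIPE, universal_newlines=True)
print(result.stdout.strip())  # expected: exit code: 0
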

@gfelot
Author

gfelot commented Apr 8, 2021

Sorry, that wasn't the right log ending.
Here it is:

[2021-04-08 13:14:30,840] {bash_operator.py:157} INFO - Lock file not found on this node!     Creating it to prevent collisions...
[2021-04-08 13:14:30,842] {bash_operator.py:157} INFO - Error creating the lock file.         Check if the airflow user can create files under tmp directory.         Exiting...
[2021-04-08 13:14:30,842] {bash_operator.py:161} INFO - Command exited with return code 0

And I can assure you that the file was created:

(screenshot attached)
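
Given the "Error creating the lock file" message in the log above versus the file clearly existing, a minimal permission check (a sketch, run as the same OS user the Airflow worker runs as) is:

# Sketch: check the worker's OS user can create and remove the lock file in /tmp.
import os

lock_path = "/tmp/airflow_log_cleanup_worker.lock"
print("already exists:", os.path.exists(lock_path))
with open(lock_path, "a"):   # same effect as `touch`
    pass
print("created/updated:", os.path.exists(lock_path))
os.remove(lock_path)
print("removed cleanly:", not os.path.exists(lock_path))
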
