to_parquet swallows NoCredentialsError #27679

Closed
languitar opened this issue Jul 31, 2019 · 7 comments · Fixed by #33645
Labels
Bug Error Reporting Incorrect or improved errors from pandas IO Parquet parquet, feather

@languitar

Code Sample, a copy-pastable example if possible

import pandas as pd

try:
    pd.DataFrame({'foo': [None, ['foo', 'bar']]}).to_parquet('s3://foo/bar')
except Exception as e:
    print('Here')

Problem description

With no credentials configured, the code above writes no output, but the exception handler is never reached either. Instead, an exception is printed to stdout or stderr:

Exception ignored in: <function AbstractBufferedFile.__del__ at 0x7fe0ae8db440>
Traceback (most recent call last):
  File "/home/languitar/.pyenv/versions/analytics-3.7/lib/python3.7/site-packages/fsspec/spec.py", line 1137, in __del__
    self.close()
  File "/home/languitar/.pyenv/versions/analytics-3.7/lib/python3.7/site-packages/fsspec/spec.py", line 1114, in close
    self.flush(force=True)
  File "/home/languitar/.pyenv/versions/analytics-3.7/lib/python3.7/site-packages/fsspec/spec.py", line 986, in flush
    self._initiate_upload()
  File "/home/languitar/.pyenv/versions/analytics-3.7/lib/python3.7/site-packages/s3fs/core.py", line 951, in _initiate_upload
    Bucket=self.bucket, Key=self.key, ACL=self.acl)
  File "/home/languitar/.pyenv/versions/analytics-3.7/lib/python3.7/site-packages/s3fs/core.py", line 939, in _call_s3
    **kwargs)
  File "/home/languitar/.pyenv/versions/analytics-3.7/lib/python3.7/site-packages/s3fs/core.py", line 182, in _call_s3
    return method(**additional_kwargs)
  File "/home/languitar/.pyenv/versions/analytics-3.7/lib/python3.7/site-packages/botocore/client.py", line 357, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/home/languitar/.pyenv/versions/analytics-3.7/lib/python3.7/site-packages/botocore/client.py", line 648, in _make_api_call
    operation_model, request_dict, request_context)
  File "/home/languitar/.pyenv/versions/analytics-3.7/lib/python3.7/site-packages/botocore/client.py", line 667, in _make_request
    return self._endpoint.make_request(operation_model, request_dict)
  File "/home/languitar/.pyenv/versions/analytics-3.7/lib/python3.7/site-packages/botocore/endpoint.py", line 102, in make_request
    return self._send_request(request_dict, operation_model)
  File "/home/languitar/.pyenv/versions/analytics-3.7/lib/python3.7/site-packages/botocore/endpoint.py", line 132, in _send_request
    request = self.create_request(request_dict, operation_model)
  File "/home/languitar/.pyenv/versions/analytics-3.7/lib/python3.7/site-packages/botocore/endpoint.py", line 116, in create_request
    operation_name=operation_model.name)
  File "/home/languitar/.pyenv/versions/analytics-3.7/lib/python3.7/site-packages/botocore/hooks.py", line 356, in emit
    return self._emitter.emit(aliased_event_name, **kwargs)
  File "/home/languitar/.pyenv/versions/analytics-3.7/lib/python3.7/site-packages/botocore/hooks.py", line 228, in emit
    return self._emit(event_name, kwargs)
  File "/home/languitar/.pyenv/versions/analytics-3.7/lib/python3.7/site-packages/botocore/hooks.py", line 211, in _emit
    response = handler(**kwargs)
  File "/home/languitar/.pyenv/versions/analytics-3.7/lib/python3.7/site-packages/botocore/signers.py", line 90, in handler
    return self.sign(operation_name, request)
  File "/home/languitar/.pyenv/versions/analytics-3.7/lib/python3.7/site-packages/botocore/signers.py", line 157, in sign
    auth.add_auth(request)
  File "/home/languitar/.pyenv/versions/analytics-3.7/lib/python3.7/site-packages/botocore/auth.py", line 425, in add_auth
    super(S3SigV4Auth, self).add_auth(request)
  File "/home/languitar/.pyenv/versions/analytics-3.7/lib/python3.7/site-packages/botocore/auth.py", line 357, in add_auth
    raise NoCredentialsError
botocore.exceptions.NoCredentialsError: Unable to locate credentials
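
The "Exception ignored in: ... __del__" header suggests why the except branch never runs: Python does not propagate exceptions raised inside a finalizer. It reports them on stderr and carries on. Below is a minimal sketch of that behaviour, assuming CPython's reference counting. `LazyWriter` is a hypothetical stand-in, not the fsspec class.

```python
import contextlib
import io


class LazyWriter:
    """Hypothetical stand-in for a buffered remote file that uploads on close."""

    def close(self):
        # Simulates the deferred upload failing, e.g. because of missing credentials.
        raise RuntimeError("Unable to locate credentials")

    def __del__(self):
        # Like AbstractBufferedFile.__del__ in the traceback, the finalizer calls close().
        self.close()


def write_without_closing():
    writer = LazyWriter()  # never closed explicitly
    del writer             # in CPython the finalizer runs here


stderr_capture = io.StringIO()
caught = False
with contextlib.redirect_stderr(stderr_capture):
    try:
        write_without_closing()
    except Exception:
        caught = True

# The error raised in __del__ is reported on stderr, not delivered to the except block.
print("caught:", caught)
print("reported on stderr:", stderr_capture.getvalue().strip() != "")
```

If the same error were raised from an explicit `close()` call or from a `with` block, it would reach the caller's `except` clause as usual.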

Expected Output

Here

Output of pd.show_versions()

INSTALLED VERSIONS

commit : None
python : 3.7.4.final.0
python-bits : 64
OS : Linux
OS-release : 4.19.61-1-lts
machine : x86_64
processor :
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8

pandas : 0.25.0
numpy : 1.17.0
pytz : 2019.1
dateutil : 2.8.0
pip : 19.2.1
setuptools : 41.0.1
Cython : None
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : 1.1.8
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 2.10.1
IPython : 7.7.0
pandas_datareader: None
bs4 : None
bottleneck : None
fastparquet : None
gcsfs : None
lxml.etree : None
matplotlib : None
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : 0.14.1
pytables : None
s3fs : 0.3.1
scipy : None
sqlalchemy : None
tables : None
xarray : None
xlrd : None
xlwt : None
xlsxwriter : 1.1.8

@TomAugspurger
Contributor

Based on "Exception ignored in: <function AbstractBufferedFile.__del__ at 0x7fe0ae8db440>", this is most likely an issue in s3fs / fsspec, right? Or do you think it's in how pandas is calling things?

@languitar
Author

I have no idea who is responsible for this.

@TomAugspurger TomAugspurger added the Needs Info Clarification about behavior needed to assess issue label Aug 1, 2019
@TomAugspurger
Contributor

TomAugspurger commented Aug 1, 2019 via email

@languitar
Author

Using S3FileSystem directly with anon=False and no credentials raises the exception as expected. So something in pandas must be swallowing it:

In [4]: s3fs.S3FileSystem(anon=False).open('/test/bar')
---------------------------------------------------------------------------
NoCredentialsError                        Traceback (most recent call last)
<ipython-input-4-04d546643798> in <module>
----> 1 s3fs.S3FileSystem(anon=False).open('/test/bar')

~/.pyenv/versions/analytics-3.7/lib/python3.7/site-packages/fsspec/spec.py in open(self, path, mode, block_size, **kwargs)
    658             ac = kwargs.pop('autocommit', not self._intrans)
    659             f = self._open(path, mode=mode, block_size=block_size,
--> 660                            autocommit=ac, **kwargs)
    661             if not ac:
    662                 self.transaction.files.append(f)

~/.pyenv/versions/analytics-3.7/lib/python3.7/site-packages/s3fs/core.py in _open(self, path, mode, block_size, acl, version_id, fill_cache, cache_type, autocommit, **kwargs)
    301                       version_id=version_id, fill_cache=fill_cache,
    302                       s3_additional_kwargs=kw, cache_type=cache_type,
--> 303                       autocommit=autocommit)
    304 
    305     def _lsdir(self, path, refresh=False, max_items=None):

~/.pyenv/versions/analytics-3.7/lib/python3.7/site-packages/s3fs/core.py in __init__(self, s3, path, mode, block_size, acl, version_id, fill_cache, s3_additional_kwargs, autocommit, cache_type)
    913         self.s3_additional_kwargs = s3_additional_kwargs or {}
    914         super().__init__(s3, path, mode, block_size, autocommit=autocommit,
--> 915                          cache_type=cache_type)
    916         if self.writable():
    917             if block_size < 5 * 2 ** 20:

~/.pyenv/versions/analytics-3.7/lib/python3.7/site-packages/fsspec/spec.py in __init__(self, fs, path, mode, block_size, autocommit, cache_type, **kwargs)
    853         if mode == 'rb':
    854             if not hasattr(self, 'details'):
--> 855                 self.details = fs.info(path)
    856             self.size = self.details['size']
    857             self.cache = caches[cache_type](self.blocksize, self._fetch_range,

~/.pyenv/versions/analytics-3.7/lib/python3.7/site-packages/s3fs/core.py in info(self, path, version_id)
    472             except ParamValidationError as e:
    473                 raise ValueError('Failed to head path %r: %s' % (path, e))
--> 474         return super().info(path)
    475 
    476     def ls(self, path, detail=False, refresh=False, **kwargs):

~/.pyenv/versions/analytics-3.7/lib/python3.7/site-packages/fsspec/spec.py in info(self, path, **kwargs)
    466         """
    467         path = self._strip_protocol(path)
--> 468         out = self.ls(self._parent(path), detail=True, **kwargs)
    469         out = [o for o in out if o['name'].rstrip('/') == path]
    470         if out:

~/.pyenv/versions/analytics-3.7/lib/python3.7/site-packages/s3fs/core.py in ls(self, path, detail, refresh, **kwargs)
    490         """
    491         path = self._strip_protocol(path).rstrip('/')
--> 492         files = self._ls(path, refresh=refresh)
    493         if not files:
    494             files = self._ls(self._parent(path), refresh=refresh)

~/.pyenv/versions/analytics-3.7/lib/python3.7/site-packages/s3fs/core.py in _ls(self, path, refresh)
    428             return self._lsbuckets(refresh)
    429         else:
--> 430             return self._lsdir(path, refresh)
    431 
    432     def exists(self, path):

~/.pyenv/versions/analytics-3.7/lib/python3.7/site-packages/s3fs/core.py in _lsdir(self, path, refresh, max_items)
    319                 files = []
    320                 dircache = []
--> 321                 for i in it:
    322                     dircache.extend(i.get('CommonPrefixes', []))
    323                     for c in i.get('Contents', []):

~/.pyenv/versions/analytics-3.7/lib/python3.7/site-packages/botocore/paginate.py in __iter__(self)
    253         self._inject_starting_params(current_kwargs)
    254         while True:
--> 255             response = self._make_request(current_kwargs)
    256             parsed = self._extract_parsed_response(response)
    257             if first_request:

~/.pyenv/versions/analytics-3.7/lib/python3.7/site-packages/botocore/paginate.py in _make_request(self, current_kwargs)
    330 
    331     def _make_request(self, current_kwargs):
--> 332         return self._method(**current_kwargs)
    333 
    334     def _extract_parsed_response(self, response):

~/.pyenv/versions/analytics-3.7/lib/python3.7/site-packages/botocore/client.py in _api_call(self, *args, **kwargs)
    355                     "%s() only accepts keyword arguments." % py_operation_name)
    356             # The "self" in this scope is referring to the BaseClient.
--> 357             return self._make_api_call(operation_name, kwargs)
    358 
    359         _api_call.__name__ = str(py_operation_name)

~/.pyenv/versions/analytics-3.7/lib/python3.7/site-packages/botocore/client.py in _make_api_call(self, operation_name, api_params)
    646         else:
    647             http, parsed_response = self._make_request(
--> 648                 operation_model, request_dict, request_context)
    649 
    650         self.meta.events.emit(

~/.pyenv/versions/analytics-3.7/lib/python3.7/site-packages/botocore/client.py in _make_request(self, operation_model, request_dict, request_context)
    665     def _make_request(self, operation_model, request_dict, request_context):
    666         try:
--> 667             return self._endpoint.make_request(operation_model, request_dict)
    668         except Exception as e:
    669             self.meta.events.emit(

~/.pyenv/versions/analytics-3.7/lib/python3.7/site-packages/botocore/endpoint.py in make_request(self, operation_model, request_dict)
    100         logger.debug("Making request for %s with params: %s",
    101                      operation_model, request_dict)
--> 102         return self._send_request(request_dict, operation_model)
    103 
    104     def create_request(self, params, operation_model=None):

~/.pyenv/versions/analytics-3.7/lib/python3.7/site-packages/botocore/endpoint.py in _send_request(self, request_dict, operation_model)
    130     def _send_request(self, request_dict, operation_model):
    131         attempts = 1
--> 132         request = self.create_request(request_dict, operation_model)
    133         context = request_dict['context']
    134         success_response, exception = self._get_response(

~/.pyenv/versions/analytics-3.7/lib/python3.7/site-packages/botocore/endpoint.py in create_request(self, params, operation_model)
    114                 op_name=operation_model.name)
    115             self._event_emitter.emit(event_name, request=request,
--> 116                                      operation_name=operation_model.name)
    117         prepared_request = self.prepare_request(request)
    118         return prepared_request

~/.pyenv/versions/analytics-3.7/lib/python3.7/site-packages/botocore/hooks.py in emit(self, event_name, **kwargs)
    354     def emit(self, event_name, **kwargs):
    355         aliased_event_name = self._alias_event_name(event_name)
--> 356         return self._emitter.emit(aliased_event_name, **kwargs)
    357 
    358     def emit_until_response(self, event_name, **kwargs):

~/.pyenv/versions/analytics-3.7/lib/python3.7/site-packages/botocore/hooks.py in emit(self, event_name, **kwargs)
    226                  handlers.
    227         """
--> 228         return self._emit(event_name, kwargs)
    229 
    230     def emit_until_response(self, event_name, **kwargs):

~/.pyenv/versions/analytics-3.7/lib/python3.7/site-packages/botocore/hooks.py in _emit(self, event_name, kwargs, stop_on_response)
    209         for handler in handlers_to_call:
    210             logger.debug('Event %s: calling handler %s', event_name, handler)
--> 211             response = handler(**kwargs)
    212             responses.append((handler, response))
    213             if stop_on_response and response is not None:

~/.pyenv/versions/analytics-3.7/lib/python3.7/site-packages/botocore/signers.py in handler(self, operation_name, request, **kwargs)
     88         # this method is invoked to sign the request.
     89         # Don't call this method directly.
---> 90         return self.sign(operation_name, request)
     91 
     92     def sign(self, operation_name, request, region_name=None,

~/.pyenv/versions/analytics-3.7/lib/python3.7/site-packages/botocore/signers.py in sign(self, operation_name, request, region_name, signing_type, expires_in, signing_name)
    155                     raise e
    156 
--> 157             auth.add_auth(request)
    158 
    159     def _choose_signer(self, operation_name, signing_type, context):

~/.pyenv/versions/analytics-3.7/lib/python3.7/site-packages/botocore/auth.py in add_auth(self, request)
    423         self._region_name = signing_context.get(
    424             'region', self._default_region_name)
--> 425         super(S3SigV4Auth, self).add_auth(request)
    426 
    427     def _modify_request_before_signing(self, request):

~/.pyenv/versions/analytics-3.7/lib/python3.7/site-packages/botocore/auth.py in add_auth(self, request)
    355     def add_auth(self, request):
    356         if self.credentials is None:
--> 357             raise NoCredentialsError
    358         datetime_now = datetime.datetime.utcnow()
    359         request.context['timestamp'] = datetime_now.strftime(SIGV4_TIMESTAMP)

NoCredentialsError: Unable to locate credentials
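
One reading of the two tracebacks in this thread: the call above opens the file in read mode, where fsspec fetches the object's metadata in `__init__` (the `mode == 'rb'` branch visible in the frames). The original failure happened in `_initiate_upload`, which is reached only from `flush()` during `close()`. A toy model of that difference follows. All class and function names are hypothetical, and this isn't fsspec's implementation.

```python
class CredentialsMissing(Exception):
    """Hypothetical stand-in for botocore's NoCredentialsError."""


class ToyRemoteFile:
    """Toy model (not fsspec's real class) of eager reads and deferred writes."""

    def __init__(self, mode):
        self.mode = mode
        self.closed = False
        if mode == "rb":
            # Read mode looks up the object's metadata immediately,
            # so a credentials problem surfaces inside open().
            self._remote_call()

    def _remote_call(self):
        raise CredentialsMissing("Unable to locate credentials")

    def write(self, data):
        # Data is only buffered locally; no remote call happens yet.
        return len(data)

    def close(self):
        if not self.closed:
            self.closed = True
            if self.mode == "wb":
                # The upload starts only when the buffer is flushed on close.
                self._remote_call()

    def __enter__(self):
        return self

    def __exit__(self, *exc_info):
        self.close()


def outcome(action):
    """Run action and report whether the credentials error reached the caller."""
    try:
        action()
    except CredentialsMissing:
        return "raised"
    return "no error"


def write_and_close():
    with ToyRemoteFile("wb") as f:
        f.write(b"data")


print("open for reading:   ", outcome(lambda: ToyRemoteFile("rb")))
print("write, never closed:", outcome(lambda: ToyRemoteFile("wb").write(b"data")))
print("write inside 'with':", outcome(write_and_close))
```

Under this model, a writer that is closed explicitly (or used in a `with` block) reports the failure to the caller, while one left for the garbage collector to close does not.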

@TomAugspurger
Contributor

TomAugspurger commented Aug 1, 2019 via email

@languitar
Author

But that doesn't work for the write in my case. So nothing is written and no exception is raised.

@jbrockmendel jbrockmendel added the IO Parquet parquet, feather label Aug 3, 2019
@simonjayhawkins simonjayhawkins added Error Reporting Incorrect or improved errors from pandas and removed Needs Info Clarification about behavior needed to assess issue labels Apr 1, 2020
@simonjayhawkins simonjayhawkins added this to the Contributions Welcome milestone Apr 1, 2020
@alimcmaster1
Member

Likely related to #32486 #32470

@mroeschke mroeschke added the Bug label Apr 19, 2020
@jreback jreback modified the milestones: Contributions Welcome, 1.1 Apr 20, 2020
@simonjayhawkins simonjayhawkins modified the milestones: 1.1, 1.0.4 May 26, 2020