NEW BUG REPORT SSL certificate verification error when using internal API endpoints from conda-analytics and Jupyter on stat host
Closed, Resolved | Public | BUG REPORT

Description

Data Platform Engineering Bug Report or Data Problem Form.

Please fill out the following

Please ensure you set priority

What kind of problem are you reporting?
  • Access related problem
  • Service related problem
  • Data related problem
For an access related problem
  • What is the system you are using? JupyterHub, conda-analytics, Jupyter notebook with Python kernel
  • What is the data or dashboard you are unable to access? Please include links and screenshots where applicable.
  • What is your ldap user name? bearloga
For a service related problem:
What is the nature of the issue?

Continuing from the Slack discussion about the internal API and my Python usage notes, where @Clement_Goubert shared that verify=False should not be needed because the certificate is internally valid. However, leaving verification enabled results in an SSL certificate verification error. This also affects the ability to route the mwapi Python package to the internal endpoint (and gain the benefits enumerated in T300977#7700803).

curl -H 'Host: en.wikipedia.org' https://mw-api-int-ro.discovery.wmnet:4446/wiki/Special:BlankPage

works great, so they think it's this issue: https://stackoverflow.com/questions/34931378/certificate-verification-when-using-virtual-environments

What are the steps to reproduce the issue?

Run the following in a Jupyter notebook with Python kernel in a conda-analytics environment:

import requests

url = 'https://mw-api-int-ro.discovery.wmnet:4446/w/api.php'

headers = {'Host': 'en.wikipedia.org'}

payload = {
    'action': 'query',
    'prop': 'info',
    'titles': 'R_(programming_language)|Python_(programming_language)',
    'format': 'json'
}

resp = requests.get(url, headers=headers, params=payload).json()

What happens?

SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1007)

What should happen instead?

No errors issuing the request to the internal API and receiving a successful response.
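To check whether the CA bundle is the culprit, it can help to print which certificate locations Python is likely to consult. The following is a minimal diagnostic sketch, not part of the original report: `ca_bundle_report` is a hypothetical helper name, and it assumes that requests falls back to the certifi bundle when neither REQUESTS_CA_BUNDLE nor CURL_CA_BUNDLE is set.

```python
import os
import ssl


def ca_bundle_report():
    """Collect the CA locations that Python and requests may consult.

    Hypothetical helper for troubleshooting; it only reads settings.
    """
    paths = ssl.get_default_verify_paths()
    report = {
        # requests checks these two environment variables, in this order.
        "REQUESTS_CA_BUNDLE": os.environ.get("REQUESTS_CA_BUNDLE"),
        "CURL_CA_BUNDLE": os.environ.get("CURL_CA_BUNDLE"),
        # Defaults used by Python's own ssl module (OpenSSL).
        "openssl_cafile": paths.cafile,
        "openssl_capath": paths.capath,
    }
    try:
        import certifi  # ships with requests; may be absent elsewhere

        report["certifi_bundle"] = certifi.where()
    except ImportError:
        report["certifi_bundle"] = None
    return report


for name, value in ca_bundle_report().items():
    print(f"{name}: {value}")
```

If the certifi bundle points inside the conda environment and both environment variables are unset, requests is probably not reading the system CA store that curl uses, which would be consistent with the error above.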

For the DE Team to fill out

Which systems does this affect?
  • Hive
  • Druid
  • Superset
  • Turnilo
  • WikiDumps
  • Wikistats
  • Airflow
  • HDFS
  • Goblin
  • Sqoop
  • Dashiki
  • DataHub
  • Spark
  • Jupyter
  • Modern Event Platform
  • Event Logging
  • Other
Impact Assessment:

Does this problem qualify as an incident?

  • Yes
  • No

Does this violate an SLO?

  • Yes
  • No
Value Calculator | Rank
Will this improve the efficiency of a team's workflow? | 1-3
Does this have an effect on our Core Metrics? | 1-3
Does this align with our strategic goals? | 1-3
Is this a blocker for another team? | 1-3

Event Timeline

mpopov created this task.

Please could you try again with the following environment variable set?

export REQUESTS_CA_BUNDLE=/etc/ssl/certs/ca-certificates.crt

...or set it somehow in your notebook. Let us know if it makes a difference.

mpopov claimed this task.

import os

os.environ['REQUESTS_CA_BUNDLE'] = '/etc/ssl/certs/ca-certificates.crt'

works! Awesome, thank you so much! And I just confirmed this makes mwapi usable internally too.
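As an alternative to setting the environment variable for the whole process, the same bundle can be passed per request through the `verify` argument of requests. The sketch below is an illustration under assumptions, not something tested in this task: `internal_request_kwargs` and `fetch_page_info` are hypothetical helper names, and the bundle path is simply the one from the workaround above.

```python
# Path taken from the workaround above; assumed to be the system CA bundle.
SYSTEM_CA_BUNDLE = "/etc/ssl/certs/ca-certificates.crt"
INTERNAL_API_URL = "https://mw-api-int-ro.discovery.wmnet:4446/w/api.php"


def internal_request_kwargs(host, params, ca_bundle=SYSTEM_CA_BUNDLE):
    """Build keyword arguments for requests.get() (hypothetical helper)."""
    return {
        "headers": {"Host": host},  # selects the wiki behind the internal endpoint
        "params": params,
        "verify": ca_bundle,  # path to the CA bundle to verify against
    }


def fetch_page_info(titles, host="en.wikipedia.org"):
    """Query page info through the internal endpoint (hypothetical helper)."""
    import requests  # imported lazily so the helper above needs no dependencies

    params = {
        "action": "query",
        "prop": "info",
        "titles": "|".join(titles),
        "format": "json",
    }
    resp = requests.get(INTERNAL_API_URL, **internal_request_kwargs(host, params))
    resp.raise_for_status()
    return resp.json()
```

Calling `fetch_page_info(["R_(programming_language)", "Python_(programming_language)"])` would issue the same query as in the reproduction steps. This only helps code that calls requests directly; a package such as mwapi that builds its own requests would still need the environment variable.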

I documented this at: https://meta.wikimedia.org/wiki/User:MPopov_(WMF)/Notes/Internal_API_requests#Python

Great! I'm glad it worked for you. It's frustrating that we still have to do it.

I remember seeing this first when we used anaconda-wmf, here: T306197: Fix anaconda-wmf's setting of REQUESTS_CA_BUNDLE

There's got to be a way to fix it once and for all, but at least you have a workaround until then.

It wouldn't fix it for anything but conda-analytics, but you could add that environment variable to /opt/conda-analytics/etc/profile.d/conda.sh?