Learn basic data science through trial and error.
Run Jupyter Notebook in a Docker container, with the current directory mounted as a persistent volume:
$ docker run -p 8888:8888 -v "$PWD":/home/jovyan/work jupyter/base-notebook
Mathematics for machine learning:
https://mml-book.github.io/book/mml-book.pdf
Run unit tests from a notebook cell:
import unittest
if __name__ == '__main__':
    unittest.main(argv=['first-arg-is-ignored'], exit=False)
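A minimal sketch of a complete test cell. The `add` function and `TestAdd` class are hypothetical, made up for illustration. Overriding `argv` keeps unittest from trying to parse the kernel's own command-line arguments, and `exit=False` stops it from calling `sys.exit()` when the run finishes, which would otherwise interrupt the notebook.

```python
import unittest


def add(a, b):
    """Hypothetical function under test."""
    return a + b


class TestAdd(unittest.TestCase):
    def test_add(self):
        self.assertEqual(add(1, 2), 3)


# argv replaces sys.argv, so the kernel's arguments are never parsed.
# exit=False returns control to the notebook instead of exiting.
program = unittest.main(argv=['first-arg-is-ignored'], exit=False)
```

The returned object keeps the outcome in `program.result`, so a later cell can check `program.result.wasSuccessful()`.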
Register the current Python environment as a Jupyter kernel:
$ python -m ipykernel install --user
Bind Command-R (Accel is Cmd on macOS, Ctrl elsewhere) to rename the current file in JupyterLab:
{
    "shortcuts": [
        {
            "command": "docmanager:rename",
            "keys": [
                "Accel R"
            ],
            "selector": "body"
        }
    ]
}
Interpolate a Python variable into a shell command:
name = 'john doe'
!echo {name}
Bad (pip3 may belong to a different Python than the one running the kernel):
!pip3 install module
Good (installs into the kernel's own interpreter):
import sys
!{sys.executable} -m pip install spacy
!{sys.executable} -m spacy download en_core_web_sm
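The same idea works outside a notebook, where the `!` syntax is not available. A minimal sketch, assuming the hypothetical helpers `pip_command` and `install` (not part of pip or Jupyter): the command always starts with `sys.executable`, so pip runs under the interpreter that is executing the code.

```python
import subprocess
import sys


def pip_command(*args):
    """Build a pip invocation bound to the running interpreter."""
    return [sys.executable, "-m", "pip", *args]


def install(package):
    """Install a package into the running interpreter's environment."""
    subprocess.check_call(pip_command("install", package))


# Only builds and prints the command; nothing is installed here.
cmd = pip_command("install", "spacy")
print(cmd)
```

Calling `install("spacy")` would run the install itself.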
Jupytext allows notebooks to be saved as Markdown:
$ brew install jupyterlab
$ pip install jupytext
$ jupyter labextension install jupyterlab-jupytext
https://jupyterlab-code-formatter.readthedocs.io/en/latest/
https://www.wrighters.io/version-control-for-jupyter-notebooks/
https://www.datacamp.com/community/tutorials/dbscan-macroscopic-investigation-python
Useful for Google Colab/Kaggle:
!mkdir data
!ls data
!curl -L https://github.com/alextanhongpin/blueprints-for-text-analytics-python/blob/master/data/abcnews-date-text.csv.gz?raw=true -o data/abcnews-date-text.csv.gz
import pandas as pd
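Once the file is downloaded, pandas can read the gzipped CSV directly. A minimal sketch: it writes a tiny stand-in file so it runs offline. The rows are made up, and the column names `publish_date` and `headline_text` are assumptions about the dataset. Point `path` at data/abcnews-date-text.csv.gz to read the real file.

```python
import gzip
import os
import tempfile

import pandas as pd

# Stand-in for data/abcnews-date-text.csv.gz; rows are placeholders.
path = os.path.join(tempfile.mkdtemp(), "sample.csv.gz")
with gzip.open(path, "wt") as f:
    f.write("publish_date,headline_text\n")
    f.write("20030219,first placeholder headline\n")
    f.write("20030220,second placeholder headline\n")

# pandas infers gzip compression from the .gz suffix.
df = pd.read_csv(path)
print(df.shape)
```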
!pip install ipympl
%matplotlib widget
A better alternative is to enlarge the default figure size:
import matplotlib.pyplot as plt
plt.rcParams["figure.figsize"] = (20, 10)
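Upgrade all outdated packages (recent pip versions reject --format=freeze together with --outdated, so parse the JSON output instead):
$ pip3 list --outdated --exclude-editable --format=json | python3 -c "import json, sys; print('\n'.join(p['name'] for p in json.load(sys.stdin)))" | xargs -n1 pip3 install -U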