I'm a self-taught Python developer with a broad skill set in Data Science and AI, currently working as a Software Development Engineer I at Sidetrek, a US-based AI startup. My goal is to grow into a data scientist role where I can turn data into actionable insights and contribute to AI research within the organization, drawing on my technical expertise, entrepreneurial background, and innovative mindset.
- Languages:
- Python
- SQL
- R
- Machine Learning:
- Classification, Regression, Clustering, Time-series Forecasting, Anomaly Detection, Recommender Engine
- Libraries:
- Scikit-learn
- XGBoost
- CatBoost
- LightGBM
- PyCaret
- Prophet
- ARIMA & SARIMA
- Feature Engineering: Feature scaling, feature encoding, binning or discretization, feature extraction, etc.
- Data Preprocessing: Data cleaning, imputation, normalization, and transformation using libraries like Pandas, Scikit-learn, etc.
- Model Evaluation & Improvement: Model evaluation through cross-validation, metrics (e.g., accuracy, F1 score), and ROC curves, and hyperparameter tuning using GridSearchCV, RandomizedSearchCV and Bayesian optimization (using Optuna)
- Model Interpretability: Explaining models using XAI tools like SHAP, Facets and LIME.
- Natural Language Processing (NLP):
- Libraries:
- spaCy
- NLTK
- Hugging Face Transformers
- Computer Vision:
- Libraries:
- OpenCV
- YOLO
- Data Versioning: Data version control using DVC
- Model Management: Experiment tracking and model versioning using MLflow
- Model Deployment: Containerization using Docker; building web apps with Flask and Streamlit; deploying models as REST APIs using FastAPI, BentoML, etc.
- Neural Networks:
- ANN, CNN, RNN, LSTM
- Libraries:
- PyTorch
- BI Tools:
- Tableau
- I'm a 3x Kaggle Expert. Visit my Kaggle Profile.
- I post Data Science and ML related writeups on my Medium blog.
- 📮 [email protected]