  • Rome

sim2000dg/README.md

Hi everyone, I am Simone Di Gregorio ✌️

About me

I hold a bachelor's degree in Management and Computer Science from LUISS, and I graduated with highest honors from the Data Science Master's Degree @Sapienza. I am now a PhD student in Data Science @Sapienza.

Through university and (a lot of) self-study, I have built a solid background in data science, from simple ETL to modelling. In particular, I have in-depth knowledge of:

  • R for machine learning, modelling, statistics, reporting (R Markdown) and data manipulation, exploiting the tidyverse ecosystem far more than the base language.
  • Python for scripting, manipulation, modelling and web scraping. Specifically, my experience revolves mostly around Pandas, NumPy, scikit-learn and TensorFlow. I have used TensorFlow both through Keras and through its lower-level APIs.
  • KNIME Analytics Platform and KNIME Server, now Business Hub, due to university projects and work experience. Specifically, I am L1, L2 and L3 certified.
  • Relational paradigm for databases and SQL.

I am a former Data Science Intern @KNIME, the software company behind KNIME Analytics Platform (and its enterprise version), a popular and powerful low-code tool for data science tasks at every level. While there, I developed KNIME-native low-code approaches for the complete Word2Vec pipeline, as well as a fast new Python-based Word2Vec node built on TensorFlow, using a mix of low-level APIs (mainly for pre-processing) and Keras for the modelling steps. The code for the node is publicly available in my Word2VecPyNodeTF repository.

Until recently, I was also a Teaching Assistant in Statistical Methods for Data Science and Laboratory, one of the main courses in the Data Science Master’s Degree @Sapienza. The course spans two semesters and introduces Probability Theory before delving into Frequentist and Bayesian Inference. My interests are mainly in statistical learning and in probability theory for stochastic processes and stochastic calculus.

How to reach me

For all things Data Science related, you can contact me via:

Popular repositories

  1. omds_project (Public)

    Repository for Optimization Methods for Data Science Final Project / Data Science @Sapienza

    Python 1

  2. DynamicGeoCells (Public)

    This package helps you build geocells automatically based on the point density of your dataset.

    Python 1 1

  3. Word2VecPyNodeTF (Public)

    Python-based KNIME node implementing Word2Vec algorithms with TensorFlow

    Python 3

  4. homework1_ADM (Public)

    Repository for the first homework of Algorithmic Methods for Data Mining @Sapienza

    Python

  5. adm_hw2-group30 (Public)

    Repository for the second homework of ADM / Data Science @Sapienza

    HTML 1

  6. hw3_group28_ADM (Public)

    Repository for the third homework of Algorithmic Methods of Data Mining and Laboratory / Data Science @Sapienza

    Jupyter Notebook 2