About
Researcher-turned-founder on a mission to ensure trust in data. Because without trust…
Activity
-
I had lunch many years ago with a fellow data exec, and just unloaded on him for an hour. I was stressed out of my mind... There was a huge…
Liked by Kevin Hu, PhD
-
Looking forward to September 9th for Startup San Diego 1st Mondays, hosted by Sharp HealthCare. Come get the latest on the San Diego startup scene and…
Liked by Kevin Hu, PhD
-
I’m hiring. If Azure, Databricks and PySpark sound like fun to you, drop me a note!
Liked by Kevin Hu, PhD
Publications
-
Sherlock: A Deep Learning Approach to Semantic Data Type Detection
Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery & Data Mining
-
VizML: A Machine Learning Approach to Visualization Recommendation
Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems
-
VizNet: Towards A Large-Scale Visualization Learning and Benchmarking Repository
Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems
-
DIVE: A Mixed-Initiative System Supporting Integrated Data Exploration Workflows
ACM SIGMOD Workshop on Human-in-the-Loop Data Analytics (HILDA)
Generating knowledge from data is an increasingly important activity. This process of data exploration consists of multiple tasks: data ingestion, visualization, statistical analysis, and storytelling. Though these tasks are complementary, analysts often execute them in separate tools. Moreover, these tools have steep learning curves due to their reliance on manual query specification. Here, we describe the design and implementation of DIVE, a web-based system that integrates state-of-the-art data exploration features into a single tool. DIVE contributes a mixed-initiative interaction scheme that combines recommendation with point-and-click manual specification, and a consistent visual language that unifies different stages of the data exploration workflow. In a controlled user study with 67 professional data scientists, we find that DIVE users were significantly more successful and faster than Excel users at completing predefined data visualization and analysis tasks.
-
Links that speak: the global language network and its association with global fame
Proceedings of the National Academy of Sciences
Languages vary enormously in global importance because of historical, demographic, political, and technological forces. However, beyond simple measures of population and economic power, there has been no rigorous quantitative way to define the global influence of languages. Here we use the structure of the networks connecting multilingual speakers and translated texts, as expressed in book translations, multiple language editions of Wikipedia, and Twitter, to provide a concept of language importance that goes beyond simple economic or demographic measures. We find that the structure of these three global language networks (GLNs) is centered on English as a global hub and around a handful of intermediate hub languages, which include Spanish, German, French, Russian, Portuguese, and Chinese. We validate the measure of a language’s centrality in the three GLNs by showing that it exhibits a strong correlation with two independent measures of the number of famous people born in the countries associated with that language. These results suggest that the position of a language in the GLN contributes to the visibility of its speakers and the global popularity of the cultural content they produce.
More activity by Kevin
-
🔥 Exciting opportunity at Hyatt! We JUST opened a Data Product Manager role today. We are looking for someone who has AI experience and would fit…
Liked by Kevin Hu, PhD
-
Easy and streaming are words that usually do not go together; we are on a mission to make them best friends. This study from IONOS is an awesome…
Liked by Kevin Hu, PhD
-
Living in Salt Lake City, we are blessed with endless outdoor activities. I can be a workaholic, so it’s nice to break up the day and get outside…
Liked by Kevin Hu, PhD
-
Excited to announce that I'll be joining Limble CMMS as a Staff Data Engineer on September 9th! In this role, I'll spearhead data architecture and…
Liked by Kevin Hu, PhD
-
Why do so many data governance initiatives fail? Because the costs and benefits of data governance in all organizations are not equally shared.…
Liked by Kevin Hu, PhD
-
One of my favorite Snowflake capabilities: SQL on unstructured data. Think of this as: SELECT Summary, Purpose, Sentiment FROM…
Liked by Kevin Hu, PhD
-
I am getting close to four years of putting out data engineering YouTube videos, and we are getting so close to 100k: only 1.8k away! Over the past…
Liked by Kevin Hu, PhD
-
Snowflake's #Copilot is GA! 😎 https://lnkd.in/e9ZkKBHB #genAI #LLM #SQL #Python
Liked by Kevin Hu, PhD
-
My strengths-based-culture presentation took a little walk over to the Upsun playlists: here's a refreshed link and a new-school-year reminder to…
Liked by Kevin Hu, PhD
-
Running SQL queries in the cloud costs (sometimes a lot of) money. What options do we have to optimize for $$$? Arachne, a paper my student Tapan…
Liked by Kevin Hu, PhD
-
5 ways to delight a Data Analyst in your life: 1. Use the dashboard they built for you. 2. Involve them in your team's strategic planning process.…
Liked by Kevin Hu, PhD
-
One of the most powerful ways to optimize Snowflake query performance is by improving query pruning. I’m excited to share three new ways you can…
Liked by Kevin Hu, PhD