DataExpert.io

Education

San Francisco, California 18,200 followers

Data Engineering education, solutions, and evangelism

About us

EcZachly Inc is a company dedicated to inspiring and educating the next generation of data talent!

Website
https://www.dataexpert.io
Industry
Education
Company size
2-10 employees
Headquarters
San Francisco, California
Type
Privately Held
Founded
2023

Updates

  • DataExpert.io reposted this

    Ari Kaplan

    Head of Evangelism at Databricks; Caltech Alumni of the Decade; DataIQ Top 20 Influencers in Data; Creator of Chicago Cubs Analytics Department

    Zach Wilson founded DataExpert.io, focused on education on data and AI, including data engineering bootcamps - and gaining over 750k followers on social! He expounds on hot trends: data quality in the world of GenAI and the unstructured space; coaching for data engineering interviews; shifts in the job market. Getting a job now is different from the old "You've heard of Spark? You're in!" - listen and share this #livefromthelakehouse episode now! With Holly Smith and Kobie Crawford #spark #databricks Databricks #data #dataengineering https://lnkd.in/gs2wV5uF

  • DataExpert.io reposted this

    Zach Wilson

    Founder @ DataExpert.io | YouTube: Data with Zach | ADHD | contact: [email protected]

    Daniel and I are teaching a live 6-week full stack engineering boot camp! For July 4th, we're offering 30% off to the first 25 people who buy before July 7th at 11:59 PM Pacific time! This boot camp will take you from JavaScript beginner to shipping production-grade web apps! You can buy here: FullStackExpert.io/july4 #softwareengineering

  • DataExpert.io reposted this

    Jesse C.

    So.. the day has finally come... As part of the DataExpert.io course, I decided to participate in the Capstone project. For my project, I wanted to explore economic data and really learn more about how that data can be utilized to inform economic analysis. As I have teased before, I named this the Rosy Economix tool, and you can check it out for yourself @ https://lnkd.in/gWNWJ3NY

    I had a great time working on this project, and while economically speaking, I have a lot to learn 😁 - I found the technical aspects to be really interesting and insightful. I was able to work with a lot of different technologies during this project, including:

    - Explore stock data using Yahoo Finance API
    - Explore economic data by querying the FRED (St Louis Federal Reserve)
    - Seed and model the various datasets using Data Build Tool! (dbt-core)
    - Refreshed myself a bit with Airflow (using Astronomer! Suuuwweeeettt! 😍 )
    - Developed a FastAPI backend to serve up the modeled data
    - Developed a ReactJS ChartJS TailwindCSS ViteJS frontend
    - Modeled the dbt manifest data, and then delivered that to ReactFlow 🤓 🤩
    - Developed a proof-of-concept interaction with OpenAI API to have chart analysis and conversational responses 🤘 🚀
    - Built environment / compose stack using Astro (and Docker overrides!)

    Whew... that was an intense six weeks :-) - but I had a blast, I learned a lot, and got to immerse myself in a wide range of technologies. Have a look at the repo - I'm sure it's not perfect, but considering it's six weeks in the making and I was also busy with class and life - I think it turned out ok! https://lnkd.in/gi3ynPXR

    And while I'm itching to push forward, I'm thinking of a little bit of a break before I figure out the next steps for where I'd like to take this project for a v2! Astronomer / dbt Labs / OpenAI / DataExpert.io

  • DataExpert.io reposted this

    Will Bassett

    Software Engineer at Frenalytics | Data | Mental Health Advocate | 🏳️🌈

    Analytical Patterns 🧩

    Patterns aren't unique to analytics; they're very prevalent across many branches of engineering. They're reusable solutions to common problems in a particular context. So, analytical patterns are just repeatable analyses! The patterns I'll highlight come in three flavors: Aggregation-based, Cumulation-based, and Window-based.

    Aggregation-based patterns:
    - Trend analysis
    - Root cause analysis

    Cumulation-based patterns:
    - State transition tracking
    - Retention tracking or Survival analysis (with J curves, see below)

    Window-based patterns:
    - Day over day, week over week, month over month, year over year
    - Rolling sum / average
    - Ranking

    Understanding how to implement these patterns leads to decreased development time and horizontal impact. These patterns can be turned into frameworks to help the rest of your team work more efficiently. It also reduces cognitive load: you'll start to realize, "oh wait, I can use <insert pattern here> pattern," instead of toiling over the SQL.

    Thanks Zach Wilson for providing a great overview of these patterns and their use-cases in depth in the DataExpert.io bootcamp. Oh, and don't forget to look where the holes are NOT! :) #dataengineering
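
As a rough illustration of two of the window-based patterns named in this post (day over day, and rolling average), here is a minimal sketch in plain Python. It is not taken from the bootcamp material; the function names and the `daily_signups` sample data are hypothetical, and in practice these patterns are usually written as SQL window functions.

```python
from collections import deque


def day_over_day(values):
    """Change versus the previous day; the first day has no prior value."""
    if not values:
        return []
    return [None] + [curr - prev for prev, curr in zip(values, values[1:])]


def rolling_average(values, window):
    """Mean of the trailing `window` values (fewer at the start of the series)."""
    if window < 1:
        raise ValueError("window must be at least 1")
    buf = deque(maxlen=window)  # automatically drops the oldest value
    averages = []
    for value in values:
        buf.append(value)
        averages.append(sum(buf) / len(buf))
    return averages


# Hypothetical sample data: one metric value per day, in date order.
daily_signups = [10, 12, 9, 15, 20]
dod = day_over_day(daily_signups)
avg3 = rolling_average(daily_signups, window=3)
```

Because both helpers take an ordered list and return a list of the same length, the results can be lined up against the original dates. Packaging a pattern once and reusing it across metrics is the kind of reuse the post describes.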

  • DataExpert.io reposted this

    Will Bassett

    Software Engineer at Frenalytics | Data | Mental Health Advocate | 🏳️🌈

    Apache Spark ✨

    Spark is a distributed compute framework that allows you to process VERY large amounts of data efficiently.

    Why is it so good?
    - It leverages RAM much more effectively than previous iterations
    - It's storage agnostic, decouples storage and compute
    - Has a massive community. Any problem you've had, someone else has probably already seen it!

    How does Spark work? It has three main components:

    The Plan:
    - Transformations you describe in Python, Scala, or SQL
    - It gets executed lazily, meaning it only gets executed when it needs to happen.

    The Driver:
    - Reads the plan
    - Determines when to actually start executing the job, how to join datasets, and how much parallelism each step needs.

    The Executors:
    - The machines that actually do the work

    Types of JOINS in Spark:
    - Shuffle Sort-Merge join
    - Broadcast Hash Join
    - Bucket join

    There was so much great information in today's lecture from Zach Wilson in the DataExpert.io boot camp. Can't wait to dive deeper into Spark throughout the week.
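
To make two of the join types listed in this post more concrete, here is a single-process sketch in plain Python of the ideas behind a broadcast hash join and a sort-merge join. This is an illustration of the algorithms only, not Spark's implementation: in Spark the small side of a broadcast join is copied to every executor, and a shuffle first co-locates equal keys before the sort-merge step. The function names and sample rows are hypothetical.

```python
from collections import defaultdict


def broadcast_hash_join(large_rows, small_rows, key):
    """Build a hash table from the small side, then stream the large side past it."""
    lookup = defaultdict(list)
    for row in small_rows:
        lookup[row[key]].append(row)
    return [
        {**big, **small}
        for big in large_rows
        for small in lookup.get(big[key], [])
    ]


def sort_merge_join(left_rows, right_rows, key):
    """Sort both sides by the key, then advance two cursors in lockstep."""
    left = sorted(left_rows, key=lambda r: r[key])
    right = sorted(right_rows, key=lambda r: r[key])
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][key], right[j][key]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Find the run of right rows sharing this key.
            j_end = j
            while j_end < len(right) and right[j_end][key] == lk:
                j_end += 1
            # Pair every left row with that key against the whole run.
            while i < len(left) and left[i][key] == lk:
                out.extend({**left[i], **r} for r in right[j:j_end])
                i += 1
            j = j_end
    return out


# Hypothetical sample data: a larger fact-like list and a small lookup list.
events = [
    {"user_id": 2, "event": "click"},
    {"user_id": 1, "event": "view"},
    {"user_id": 2, "event": "buy"},
    {"user_id": 3, "event": "view"},
]
users = [{"user_id": 1, "name": "Ada"}, {"user_id": 2, "name": "Lin"}]

bhj = broadcast_hash_join(events, users, "user_id")
smj = sort_merge_join(events, users, "user_id")
```

Both functions return the same inner-join rows, possibly in a different order. The difference is in the work done: the hash variant never sorts the large side, which is why it tends to be preferred when one side is small enough to fit in memory.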

  • DataExpert.io reposted this

    Amith Singh

    Senior Data Engineer | Writing About Data Engineering & Career Growth

    It was a bright Friday morning when I found myself in the dentist's chair, mouth agape and bracing for the inevitable pain of a root canal procedure. As the dentist began drilling and the sharp sensations shot through my jaw, my mind - ever the restless wanderer - decided to seek refuge in more pleasant thoughts such as Zach Wilson's DataExpert.io bootcamp V4 😀 Amidst the cacophony of high-pitched whirs, my brain fixated on the dentist's meticulous attention to detail, which made me ponder the intricacies of Week 1 dimensional data modelling, Cumulative Table Design. I found myself mentally designing a robust table, complete with all the joins, structs, arrays, maps etc... The thought of building efficient dbt transformations to power insightful dashboards and reports temporarily distracted me from the uncomfortable reality unfolding in my mouth. By the time the dentist had finished and was stitching me up, I had practically outlined an entire data engineering roadmap in my head. #data #DataExpert.io #dataengineering #dataengineers #dataengineer

  • DataExpert.io reposted this

    Will Bassett

    Software Engineer at Frenalytics | Data | Mental Health Advocate | 🏳️🌈

    Data Quality ✅

    This is extremely important in data engineering because it ensures that data is reliable and trustworthy for downstream consumers. Here are a few components of ensuring good data quality.

    Awareness:
    - If no one knows your dataset exists it has no value.
    - Cataloging can solve this issue.

    Usability:
    - Partially solved by data modeling, think usability vs compactness tradeoff.
    - Consider what your downstream consumer needs. A dashboard? Datatable?

    Correctness:
    - Proper validation before release, someone else (an analyst) should validate your data!
    - Automated data quality checks (see graphic below).

    Comprehensiveness:
    - Document all gaps!!! Explain what your data cannot do.
    - Find ALL the relevant data, completeness is important.

    Reliability:
    - Use efficient design patterns.
    - Use streaming when it makes sense, skew probability is lower in smaller time frames.
    - Set a Service Level Agreement (SLA) for when data is expected to arrive to consumers.

    Affordability:
    - Leverage compression the right way, Parquet is great.
    - Minimize I/O with incremental builds.
    - Sample data sets when smaller samples don't affect the metric with statistical significance.

    I love how Zach Wilson teaches with real-life experience to emphasize the importance of having good quality data. DataExpert.io continues to deliver great content on how to stand out as an engineer! #DataEngineering #DataExpertio #DataQuality
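
The "automated data quality checks" item under Correctness could look something like the sketch below. It is a minimal, hypothetical example in plain Python, not the checks shown in the post's graphic (which is not available here). The function name, parameters, and sample rows are all assumptions; a real pipeline would more likely use a dedicated tool such as dbt tests.

```python
def run_quality_checks(rows, key, required, min_rows=1):
    """Return a list of human-readable failures; an empty list means all checks passed."""
    failures = []

    # Row-count check: an empty or tiny load often signals an upstream problem.
    if len(rows) < min_rows:
        failures.append(f"expected at least {min_rows} row(s), got {len(rows)}")

    # Uniqueness check on the primary key.
    seen, duplicates = set(), 0
    for row in rows:
        value = row.get(key)
        if value in seen:
            duplicates += 1
        seen.add(value)
    if duplicates:
        failures.append(f"duplicate {key}: {duplicates} extra row(s)")

    # Not-null check on every required column.
    for column in required:
        nulls = sum(1 for row in rows if row.get(column) is None)
        if nulls:
            failures.append(f"null {column}: {nulls} row(s)")

    return failures


# Hypothetical sample data with one duplicate key and one missing amount.
orders = [
    {"order_id": 1, "amount": 9.5},
    {"order_id": 2, "amount": None},
    {"order_id": 2, "amount": 4.0},
]
failures = run_quality_checks(orders, key="order_id", required=["order_id", "amount"])
```

Returning a list of failures instead of raising on the first one lets a pipeline report every problem in a single run and then decide whether to block the release.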

  • DataExpert.io reposted this

    And the hits keep coming!!! Zach and Careerflow.ai have partnered for DataExpert.io v4 and surprised us with a 3-month subscription! I just finished a demo with Puneet Kohli using this amazing tool. Lots of great features for helping job seekers identify and pursue opportunities, manage applications, and track contacts and engagements. Another great reason to join this program! Stay tuned as I'll definitely be testing out that AI LinkedIn post generator! 🤣 Thanks to the folks at DataExpert.io and Careerflow.ai for putting this together! Zach Wilson / Carly Taylor, M.Sc. / Stephanie Nuesi

  • DataExpert.io reposted this

    Grisell R.

    Data Engineer | Machine Learning | Ex-Google

    Being part of the Data Engineering boot camp organized by Zach Wilson (DataExpert.io) has been an awesome journey so far. Thanks to Bruno Souza de Lima for the very insightful course on dbt, which I have wanted to learn and get hands-on with for a long time. Extra resources that he mentioned and that caught my attention: Great Expectations is a little bit controversial for data quality checks, but the integration with dbt is worth reviewing. https://lnkd.in/esG4UKzq Terraform and dbt: https://lnkd.in/eyi5ftrP

    Bruno Souza de Lima

    dbt Tech Lead @ phData | dbt Community Spotlight 🌟 and Instructor | Follow me for daily dbt content! 🔶

    🔥 Yesterday I had the opportunity to teach dbt in Zach Wilson's Data Engineering boot camp! It was awesome to talk about dbt with hundreds of students around the world! I still have a lot to improve, but I feel like I'm getting better at explaining dbt each time. One thing never changes though, live demos will always surprise you with unexpected errors 😆

    On this first day, I covered:
    - What is dbt? What problems it solves
    - Models
    - Tests (data and unit tests)
    - Sources (and freshness)
    - Seeds
    - Snapshots
    - Packages
    - Linting
    - Target (run/compiled)
    - Documentation
    - Commands (run, test, seed, snapshot, build, show, debug, docs...)
    - dbt demo (building a full dbt project)

    For the second day, I plan to cover:
    - Incremental
    - Macros and variables
    - Contracts
    - Advanced Pipelines

    Thanks, Zach Wilson for this opportunity, and all students for attending! #dataengineering #dataengineer #analyticsengineer #analyticsengineering

