DuckDB: Bringing analytical SQL directly to your Python shell (EuroPython 2023)

Website
https://ep2023.europython.eu/session/...

Speaker
Pedro Holanda

Abstract
In this talk, we will present DuckDB, a novel data management system that executes analytical SQL queries without requiring a server. DuckDB has a unique, in-depth integration with the existing PyData ecosystem. This integration allows DuckDB to query data from, and output data to, other Python libraries without copying it, which makes DuckDB an essential tool for the data scientist. Besides learning about DuckDB's main characteristics, attendees will see a live demo of DuckDB and Pandas, the most used Python data-wrangling tool, in a typical data science scenario. The demo compares their performance and usability and shows how they cooperate. It is most interesting for an audience familiar with Python, the Pandas API, and SQL.

Description
The talk is aimed primarily at data scientists and data engineers. It aims to familiarize users with the design differences between Pandas and DuckDB and to show how to combine them to solve their data science needs. We will give an overview of five main characteristics of DuckDB: 1) Vectorized Execution Engine, 2) End-to-end Query Optimization, 3) Automatic Parallelism, 4) Beyond Memory Execution, and 5) Data Compression.

posted by quimista63