For information on upcoming conferences, visit [ Link ].
Mapping Ever Larger Data with PostGIS, DuckDB, GeoArrow and deck.gl by Jared P. Lander
Presentation Slides: [ Link ]
Abstract: The volume of spatial data available to analyze grows larger every year. Fortunately, the tools used to analyze these data are improving at a faster pace. During this talk we will look at four key aspects of the geospatial pipeline. We start with storing the data efficiently using Postgres with the PostGIS and TimescaleDB extensions installed for smart partitioning. Then we perform various spatial queries using the DuckDB query engine while the data are still in Postgres. After that we use DuckDB to quickly extract the data from Postgres into GeoArrow to enable columnar operations. Finally, we visualize large-scale data with the high-performance deck.gl library, including filtering and aggregating data on the fly with Arquero. Together, these steps make for a high-performance geo workflow on large data.
Bio: Jared P. Lander is Chief Data Scientist of Lander Analytics, the Organizer of the New York Open Statistical Programming Meetup and the New York R Conference, and an Adjunct Professor of Statistics at Columbia University. With a master's from Columbia University in statistics and a bachelor's from Muhlenberg College in mathematics, he has experience in both academic research and industry. Jared oversees the long-term direction of the company and acts as Lead Data Scientist, researching the best strategy, models and algorithms for modern data needs. This is in addition to his client-facing consulting and training. He specializes in data management, multilevel models, machine learning, generalized linear models, visualization and statistical computing. He is the author of R for Everyone, the best-selling book about R Programming geared toward Data Scientists and Non-Statisticians alike. The book is available from Amazon, Barnes & Noble and InformIT. The material is drawn from the classes he teaches at Columbia and is incorporated into his corporate training. Very active in the data community, Jared is a frequent speaker at conferences, universities and meetups around the world. His writings on statistics can be found at jaredlander.com.
Twitter: [ Link ]
Presented at the 2024 Government & Public Sector R Conference (October 29, 2024)
Hosted by Lander Analytics ([ Link ])