Owen O'Malley
@owen_omalley
Previously a founder at Hortonworks. I've worked on HDFS, Hive, Iceberg, MapReduce, ORC, and Security.
ID:25039604
http://fosstodon.org/@omalley 18-03-2009 06:46:43
1.4K Tweets
6.2K Followers
628 Following
At an ApacheCon talk by @mmthakur1203 from Cloudera about adding the async vectored I/O read API to Apache Hadoop and Apache ORC. It improves TPC-DS benchmark performance using ORC by 20-40%.
Hi Apache Spark friends, we've been working a bit on putting together a flow chart to help you debug slow or failing jobs. You can see it at buff.ly/3nEa1Fc, and PRs are very, very welcome at buff.ly/3FEOJ0l
It is great having public access to the NYC taxi data (from the NYC Taxi & Limousine Commission). I’ve been using it for years to benchmark and improve the open source big data tools. Unfortunately, they’ve moved to storing the money columns as doubles. Never ever use floats or doubles for storing money.
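A minimal sketch of why doubles are a poor fit for money, using Python's standard `decimal` module. The fare values are hypothetical and chosen only to show the rounding problem; they are not taken from the NYC taxi data.

```python
from decimal import Decimal

# Hypothetical fare amounts, written as strings so no precision is lost on input.
fares = ["0.10", "0.20"]

# Summing as binary doubles: 0.1 and 0.2 have no exact binary representation,
# so the total drifts away from the expected 0.30.
float_total = sum(float(f) for f in fares)
float_matches = float_total == 0.30

# Summing as base-10 decimals keeps the cents exact.
decimal_total = sum(Decimal(f) for f in fares)
decimal_matches = decimal_total == Decimal("0.30")

print("float total:  ", repr(float_total), "equals 0.30?", float_matches)
print("decimal total:", decimal_total, "equals 0.30?", decimal_matches)
```

In a columnar format the same idea usually means a fixed-scale decimal column, or integer cents, instead of a double column.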
We operate Trino at an insane scale and we’re just getting started. We have an exciting roadmap, including moving to Azure. I have several openings at all levels on my team. As an added perk, you will get to partner with Martin Traverso, Dain Sundstrom, David Phillips, and the community!
This is a LinkedIn Alumni week! Launch of StarTree (@KishoreBytes). Unveiling of Acryl Data (@shirshanka) and of course the Confluent IPO (@jaykreps, Jun Rao, Neha Narkhede). Congratulations all!