Today, I’m in awe as we celebrate the 100th issue of Projects to Know. I never planned to write a newsletter. A few years ago our Director of Marketing, Malia Powers, suggested that I write a blog post on data quality. After hitting writer’s block three-quarters of the way through a draft, I proposed that I write a short newsletter instead. Certainly, a short newsletter would be easier to author than a comprehensive review of data testing, monitoring, and preparation tools and technologies! What’s more, I felt frustrated by existing newsletters on data science and ML, many of which focused on attention-grabbing headlines about the impact of AI on climate change, cancer, and space exploration. I saw the need for a publication that would highlight research and projects with more practical relevance – content that data practitioners and developers would and could apply.
100 issues later, I’ve confirmed that writing a newsletter is, in fact, more challenging than writing a blog post. Nonetheless, I have zero regrets. More importantly, I’ve validated that data practitioners and developers want to learn more about the projects that their peers are building in academia and industry; as research papers, internal platforms, and OSS tools; in machine intelligence, distributed systems, and data management.
While I sometimes grumble about summarizing lengthy academic papers, surfing engineering blogs, and sifting through GitHub repos every weekend, I’m so inspired by those who create and consume projects. Projects to Know has proven to me that the impact of a paper reaches deeper than an academic conference, the impact of an OSS project cannot be measured with GitHub stars, and the impact of internal initiatives often extends far beyond a single company. I’m motivated to continue writing by the readers from unicorn tech companies who share how they’re applying the models described in featured papers, and by the creators of OSS database technologies who connect with academic collaborators through PTK.
When I started my career in data in 2009, it was a lonely profession. When I became a manager in 2012, I had so few peers to turn to for mentorship and guidance. But things have changed, and now there are so many data practitioners and developers who want to communicate and collaborate, who galvanize each other to try more experimental approaches to managing teams or to reveal the skunkworks project they’ve been working on between calls and meetings. Now, a community exists and there are so many more Projects to Know.
Below, we’ve highlighted a few projects from this expansive compilation – the most popular Papers, Projects, and Content from 4 sets of issues. You’ll see that these projects span a range of topics – from privacy-preserving machine learning to literate programming to serverless prediction serving. They’re created by authors from Tennessee to Singapore and from institutions ranging from F500 companies like Nike to seed stage startups like Ponder. It’s hard to skim through this list without feeling awe – there’s just so much to learn and so many people to learn from.
I’m so excited to celebrate an ever-growing community and corpus of knowledge today. Thanks for your contributions, readership, and support.
Not a subscriber? Subscribe here and get 3 academic papers and 3 open source projects that are playing a meaningful role in advancing machine intelligence and data science in your inbox on a weekly basis! You can view past issues here.