Building the Data Quality Community
TL;DR: I’m launching a new community around Data Quality called Data Quality Camp. It has some of my favorite people in it. It would be great if you joined.
This will be a different kind of post. (If you don’t care about my philosophical waxing, skip to the end for more details on the community.)
If you follow me on LinkedIn, you probably know that for the last two years I’ve been talking consistently about data, data quality, experimentation, and machine learning. I believe community-oriented education is one of the most valuable exercises any tech professional can engage in, both in service of the community you support and as a mechanism for self-improvement.
I’ve been involved in tech community development since 2017. Five years ago I worked in a field called Conversion Optimization, which applied data science techniques like A/B testing to marketing problems. One of the leading forums at the time was CXL, which had and still maintains a Facebook group of over 11,000 members.
Because I specialized in a niche skill, working on large-scale experimentation platforms, I was in a unique position to help others earlier in their journey toward CO mastery, as the concept of applied science at scale was still new outside of Big Tech. Every night I’d hang around the Facebook group after work and answer questions ranging from ‘how do I influence my boss to try running their first experiment’ to how to debug experiment results in popular platforms like Adobe Target or Google Optimize.
While helping people was rewarding in and of itself, I found that the most meaningful professional growth I experienced came as an outcome of freely giving knowledge to others. My first-ever large-scale speaking engagement happened because Peep Laja, who ran CXL and maintained the FB channel, felt I was adding enough value to take a marquee spot at his annual conference. Because a few other conference organizers in attendance liked my talk, I was brought into a speaking circuit where I eventually crossed paths with former Microsoft CVP and experimentation legend Ronny Kohavi. When I found myself in San Francisco for a PM role at an eCommerce company, Ronny offered me a spot on the AI Platform team at Microsoft, which supercharged my career in data infrastructure.
But really, my inclination for education goes back even further. Both my parents are teachers (or were, in my retired dad’s case) at middle and high schools in suburban Georgia. My first real job was as a martial arts instructor for teenagers at $10 an hour. My second was teaching graphic design through an online course I’ve since forgotten the name of. In both cases, the successes we had stemmed as much from the community of passionate supporters as they did from me.
When you’ve been in or around education for long enough, you begin to see patterns emerge. One pattern that’s become clear to me, and that has been magnified through my time as an investor and advisor for data startups, is the importance of community as an engine that drives culture shift through personal connections. The dbt Slack and Locally Optimistic are thematically relevant examples of this, but others like #measure chat have been leading in the marketing analytics space for the past 8-9 years to great effect. Walking into a conference and recognizing 50 people you’ve interacted with online makes real-world events less like attempting to break the ice with strangers on a first date and more like meeting colleagues you are ready to challenge and debate straight away.
What I’ve observed in great communities is that well-maintained, centralized knowledge repositories can be more than a place you check in once in a while to read yet another vendor announcement. They can build entire careers and help drive change across the industry itself, acting as a sort of metaphorical open standard for operations. Complex ideas are refined repeatedly, like water shaping a rock with each new question, until what emerges is a defensible principle so conceptually solid that it is almost impossible to deny (or, if it is possible, takes a lot of work). Regardless of my own stance, the belief that data teams should be treated like software engineers has firmly taken hold of the data community and grows in intensity and fervor with each passing day. This belief has resulted in the rise of an entirely new profession, management structure, and toolset. Career building + industry changing.
As the era of machine intelligence dawns upon us, we are already hearing a cry from prominent data science leaders that we need data-centric rather than model-centric AI. Similarly, as companies begin their transition to the cloud, we see an explosion of tech debt ravage the warehouse. In the face of a downturn, every data engineering manager’s number one priority will shift to ‘how do I manage this spend’ and ‘how can we function with our headcount cut by some horrifying percentage.’ While these are topics I regularly address on my Substack, on LinkedIn, and within the digital content of others, there is no clear view from the industry on what we should actually do when these situations arise.
I think this is a shame. While I’d like to believe the content I’ve produced has helped push practitioners toward meaningful solutions both organizationally and technically, there are other voices in this conversation worth hearing. The purpose of Data Quality Camp is to be a home to those voices, and hopefully (seriously, fingers crossed) the space becomes a driving force for developing and hardening the principles of applying Data Quality at scale. In the same way that my own career was uplifted by contributing to an ecosystem defining the future of the industry, I’d like to see other vocal folks in the Data Quality profession uplifted too.
So with that out of the way, here’s the pitch for DQC:
Data Quality Camp is all about driving practitioner-led conversations around the most challenging aspects of data quality at scale including data contracts, data architecture, data modeling, monitoring, CI/CD, and more.
Who the community is for: Data engineers, data platform teams, data scientists, analysts, software engineers, data product managers, and data executives solving data quality problems in a professional capacity.
Who the community is not for: folks looking to make a transition to data science or data engineering at the entry level, learn common languages like SQL or Python, or anyone otherwise not participating in the data ecosystem (whether as a producer or consumer).
What does the community entail:
Slack channel: Currently features some of the people I enjoy learning from and talking to most in the industry, like Joe Reis, Megan Lieu, Benjamin Rogojan, Greg Coquillo, Chris Riccomini, Mary MacCarthy, Lauren Balik, Mark Freeman, and Wendy Turner-Williams, to name a few of the exceptional folks already present.
Data Quality Podcast: Landing each Tuesday with a community member.
Ask Me Anything: Featuring me and other community members on a weekly/bi-weekly cadence
Introductions: My goal is to expand your network, and that means making intros to the right people when you need them.
Offline Meetups: Coming soon!
Heavy moderation in effect from day 1: No vendor solicitation, lead generation, or harassment of any kind tolerated
Practitioner first: The goal of this community is not to function as an intro to data engineering, but to specifically focus on practitioners dealing with tough data quality challenges. We’ll keep it that way!
Getting Started Guide
Register by clicking the link below
You’ll get an automated message from me, asking for more details about your career goals and top DQ problems. This will help me understand what intros to make, what content to send your way, and what communities you should be a part of (please respond to this, it’s very helpful)
Fill out your Slack profile
Head to the #introduce_yourself channel and leave a comment
Make sure to sign up for the DQC Networking Platform, which is what we’ll use to connect you to other like-minded industry folks.
Shoot me a message! @chad on the channel. I’d love to hear from you.
For everyone following, thank you a ton for all the support thus far. It has been excellent growing this community alongside you, and I look forward to seeing you in the chat. Happy Friday!