Here at Hoodline we are committed to creating informative, high-quality, and engaging content from a neighborhood-level perspective. We believe that many of the most exciting and important news stories, the local stories that deepen readers’ connection to their own communities, remain untold today. We see an opportunity to fill this gap using data.
To do this, we’re building a team of data scientists who can query, scrape, transform, and analyze data for newsworthy insights and help turn them into articles that are not only fun to read but also provide an essential service to anyone who wants to better understand a neighborhood. Through our Content API, we make these stories available to distributors, reaching a nationwide audience across a range of different sites and apps. Our growing list of partners includes Yelp, Eventbrite, Uber, Zumper, ABC television, McClatchy, Advance Digital, TripAdvisor and Vice.
Our small team is growing quickly, and we are looking for someone brilliant, enthusiastic, and experienced to take the lead. Come help us tell the stories of the world’s neighborhoods!
About the Position
Your primary responsibility will be overseeing the technical development of our content automation pipeline, through which we turn insights from data into human-readable articles with minimal or no involvement from a human editor. You will work closely with our product, engineering, and editorial teams to deliver the highest-quality content quickly and at scale. A strong foundation in statistics, machine learning, Python, and SQL is necessary. A solid understanding of NLP techniques, including TF-IDF, Word2Vec, GloVe, and associated modeling techniques, is also required. The ideal candidate will have some familiarity with using RNNs and/or LSTMs for natural language generation.
Our team has years of success in consumer tech and in online and local media. We’re backed by a range of top tech investors including Rakuten, Greylock Partners, Social Capital, Graph Ventures, Charles River Ventures, Eric Schmidt's Innovation Endeavors, Pear Ventures, Matter Ventures, the John S. & James L. Knight Foundation, 500 Startups, and SoftTech VC. Angel investors include Joi Ito, Director of the MIT Media Lab; Cyan and Scott Banister; Ben Silbermann of Pinterest; and Shane Smith of VICE.
As an early member of our Data Science team, you will have an outsized impact on the future of Hoodline’s platform. We’re building and growing, which means you’ll be able to leave your fingerprints all over the product while working with a world-class team to tackle one of tech’s largest and most challenging problems: local.
If you’re curious, driven, and enjoy a challenge, this could be the right home for you. Join us!
Responsibilities
- Leveraging our large amount of structured and unstructured data to extract newsworthy insights
- Developing algorithms and modeling techniques to expose conclusively how communities work at a neighborhood level
- Developing metrics and evaluating the performance of stories by designing experiments to drive product requirements
- Directing and providing technical support to data scientists and engineers working on content automation
- Growing and mentoring the team
- Keeping up with recent advances in natural language processing, machine learning, and big data processing
Requirements
- MSc (PhD preferred) in the hard sciences, computer science, natural language processing, or machine learning
- Extensive experience working with various datasets, familiarity with common schemas, and an understanding of both the power and shortcomings of data
- Experience mentoring and building a technical team
- Experience in large-scale software development and the product development process (code design, test planning, and code reviews)
- Fluency in Python, including common data science tools such as Jupyter notebooks, scikit-learn, pandas, and matplotlib
- Fluency in SQL and the ability to write queries against large datasets
- Experience with commonly available tools and infrastructures for natural language processing, text mining, machine learning and parallel data processing
- Experience working with large amounts of user-generated content and processing data in large-scale environments using Amazon EC2, Storm, Hadoop and Spark
- 5+ years of professional experience in the field
- Passion for data-driven products
- Passion for uncovering stories in the local news space
Nice to Have
- Experience with advanced natural language generation techniques
- Active participation in meetups or groups related to: Data Analytics, Hadoop, Cloud Computing, Data Visualization, Data Mining, MapReduce, Machine Learning, High Scalability Computing, Predictive Analytics