Data Science Engineer
Spirit AI is building a tool to empower community managers in novel ways – an Ally to assist in nourishing communities to their fullest. Ally uses the latest advances in machine learning to distill the millions of lines of text communication in a community down to the interactions that help or hurt its growth.
We're seeking an experienced data science engineer with an interest in putting data science models into production as well as in building big data pipelines. Ideally, you're interested in both data science and data engineering at scale, able to communicate across disciplines and to help multiple teams set up, monitor, and test data-intensive applications. You would partner with internal stakeholders to develop tooling for streaming data, put NLP deep learning and other models into production, define and deploy APIs, and provide leadership in software engineering best practices for data products. You're probably more interested in data engineering and pipelines than in day-in, day-out model tweaking.
- Build frameworks tackling meta-problems such as model deployment, data validation, and pipeline orchestration. Familiarity with tools like Airflow and Kafka is a major plus.
- Facilitate self-service access to data, and the building of data products, for data science experts as well as related product engineering teams.
- Assist in defining best practices for engineering with machine learning models (testing, performance, API development and mocking).
- Deploy our codebase to production on cloud services using Docker containers.
Python (including scikit-learn), SQL, Mongo, Kafka, Git.
Scala, Node.js, Cloud Infrastructure (AWS/GCP), Airflow, Docker, Redis, BigQuery, Solr, TensorFlow, scikit-learn and spaCy, Spark.
- Ideally, at least a couple of years building data pipelines for data science in big data organizations.
- Experience with moving data science models into production and testing/monitoring them.
- TDD/BDD experience.
- Experience with streaming data a plus.
Ideally you will be in the Boston area, or possibly somewhere else on the East Coast. Less ideally, you are located around London, or the general CET time zone.
As a polyglot team of software engineers and data scientists, we choose the technology to fit the problem. We are an open and friendly group, where mistakes are treated as an opportunity to learn. We work in a continuous integration environment with a peer review process to ensure that things move smoothly into production.
Interested? Email us your CV.