At Transposit, we are committed to unlocking the power of APIs. We work in an entrepreneurial, creative, and collaborative environment. Our team is open, driven, inclusive, and honest; every employee makes a big impact. Based in San Francisco, the company is well funded by top investors and led by a team with successful track records. If you are passionate about pushing your own limits and working closely with others to define the future, you’ll find yourself at home at Transposit.
Transposit operates a 24/7 service with all technical staff participating in its development, operation, and maintenance. The SRE team takes a leadership role in designing and constructing all aspects of the service for high reliability, resiliency, and scale. This position has a heavy emphasis on automation, autonomy, and vision.
The Principal Site Reliability Engineer will be responsible for:
- Writing automation software for provisioning and operating Transposit services and infrastructure at scale.
- Designing and enhancing software architecture to improve efficiency, scalability, and reliability.
- Collaborating closely with the product development team from inception through deployment and maintenance.
- Designing, building, maintaining, and scaling production services.
Requirements:
- 6+ years of Unix/Linux experience (familiarity with tools, kernel, and networking).
- Strong development and automation skills.
- Strong debugging skills (experience with Java debugging is a plus!).
- Extensive experience with CI/CD pipelines and Infrastructure as Code (Terraform, CloudFormation, etc.).
- Extensive experience with a variety of AWS services (e.g. ECS, Lambda, S3, EC2, RDS, EFS, ALB, VPC).
- Automation/tools-first mindset – building tools that increase efficiency and make mundane, difficult or repetitive tasks easy and quick to do!
- Experience building cloud infrastructure that enables reliable and rapid deployment of microservices with effective monitoring and resilient operations.