🎁 Platform Ops, SRE, Site Reliability Engineer, Production Support Legend (AWS + Node, TS) ❄️🚀

Job Category: SRE
Job Type: Full Time
Job Location: Sydney-Hybrid

Ever wanted a role where the product is scaling fast, the ambitions are global, and your work genuinely keeps the lights on for real humans around the world? 🌏✨

This innovative SaaS platform is growing up quickly, and we need someone who loves calm thinking in chaotic moments, sharp debugging, and building the kind of operational muscle that makes a business unbreakable 🧠🔧

And yep, you can start basically as fast as Santa can deliver on Christmas Eve 🛷💨🎁

✅ Hybrid, 3 days a week in the Sydney office 🏙️🎄

You’ll be working with smart, ambitious, fun, engaged people who all know they’re building something bigger than just another SaaS 🚀✨

🎄 What you’ll be doing

  • Jumping into hands-on production support from day one, then steadily evolving the function into proper SRE practices and a true Site Reliability Engineer capability over time 🚀🧑‍🚒
  • Acting as the main escalation point for production, owning incidents end to end, and keeping everyone aligned while you drive resolution 🚨✅
  • Helping design a real on-call and incident rhythm. This role will shape the playbook, and your SRE mindset will lift how the whole team runs 📘🛠️
  • Digging into data, logs, and platform behaviour, especially the deeper investigations that sit beyond typical customer support triage 🔍🧩
  • Debugging Node.js and TypeScript services, untangling API weirdness, and finding the real root cause 🕵️‍♂️⚙️
  • Living in AWS serverless land, event-driven systems, logs, dashboards, alerts, and the stuff nobody appreciates until it breaks ☁️⚙️
  • Troubleshooting problems that pop up in the mobile universe, iOS and Android, where “works on my phone” is not a strategy 📱😅
  • Partnering with the local engineering team in Sydney, and over time helping build a follow-the-sun approach, with eventual coverage across EU and US time zones 🌍✨
  • Helping the platform mature from strong production support into a more structured Site Reliability Engineer model, including better runbooks, stronger signals, and cleaner incident response 📘📈

🎅 What you’ll bring, plus bonus points if you’ve got it

  • You love troubleshooting: you’re the person who can take a messy, vague issue and calmly turn it into a clear root cause and a clean fix 🧩🕵️‍♂️
  • Confident navigating logs and data stores, and doing deeper platform investigations when the answer is not obvious 🔍🧠
  • Comfortable working with ambiguity: you don’t need a 12-step checklist to get moving; you create momentum and go find the answers 🎯⚡
  • Around 6+ years in a hands-on environment like platform operations, production support, reliability engineering, or technical support for complex systems 🧰
  • Strong at getting to the bottom of issues in Node.js and TypeScript, including API integrations and service-to-service weirdness 🔧🧠
  • Confident troubleshooting mobile ecosystem issues: iOS and Android behaviour, device quirks, and app-to-backend interactions 📱⚙️
  • Comfortable jumping into cloud-first setups, especially modern serverless patterns, monitoring, storage, caching, and search-type services ☁️🔍
  • You’ve been on the hook for serious incidents before, high severity, high impact, and you know how to respond calmly, communicate clearly, and restore service fast 🚨🧯
  • You write and speak well, and you can work with engineers, product, support, and leadership without anything getting lost in translation ✍️🤝
  • Familiarity with observability and monitoring tooling, whether that’s native cloud monitoring or broader platforms; you know what “good signals” look like 📈🔭
  • Experience supporting a global SaaS or consumer-facing product, where scale, latency, and reliability actually matter 🌍✨
  • Bonus if you’ve touched React or React Native enough to understand common architecture patterns and where problems tend to hide 🧑‍💻📲
  • Helpful if you’ve wrestled with deployments and pipelines, CI/CD troubleshooting, release issues, rollback moments, the whole fun circus 🎪🚀
  • Handy if you can support some older tech stacks too, including Windows-based environments and some .NET-era realities 🪟🧓
  • Nice if you’ve dealt with regulated or safety-critical environments, or systems where the operational bar is higher than average 🛡️
  • Even better if you’ve helped lift a team’s capability, playbooks, processes, incident routines, on-call maturity, or operational discipline 🧱📘

🎄 The vibe

Fast growth, global realities, and a team that wants someone proactive, curious, and calm under pressure 🙃✅

If you like building stability while the rocket is already in the air, this one’s got your name on it 🚀🎅

📩 Email Keiran, keiran@bigwavedigital.com.au, or simply apply via the job advert 🎄🎅❄️🚀
