JOBS

🎁 Platform Ops, SRE, Site Reliability Engineer, Production Support Legend (AWS + Node, TS) ❄️🚀

1 week ago

Job Category: SRE

Job Type: Full Time

Job Location: Sydney-Hybrid

Ever wanted a role where the product is scaling fast, the ambitions are global, and your work genuinely keeps the lights on for real humans around the world? 🌏✨

This innovative SaaS platform is growing up quickly, and we need someone who loves calm thinking in chaotic moments, sharp debugging, and building the kind of operational muscle that makes a business unbreakable 🧠🔧

And yep, you can start basically as fast as Santa can deliver on Christmas Eve 🛷💨🎁

✅ Hybrid, 3 days a week in the Sydney office 🏙️🎄

You’ll be working with smart, ambitious, fun, engaged people who all know they’re building something bigger than just another SaaS 🚀✨

🎄 What you’ll be doing

Jumping into hands-on production support from day one, then steadily evolving the function into proper SRE practices and a true Site Reliability Engineer capability over time 🚀🧑‍🚒
Acting as the main escalation point for production, owning incidents end to end, and keeping everyone aligned while you drive resolution 🚨✅
Helping design a real on-call and incident rhythm. This role will shape the playbook, and your SRE mindset will lift how the whole team runs 📘🛠️
Digging into data, logs, and platform behaviour, especially the deeper investigations that sit beyond typical customer support triage 🔍🧩
Debugging Node.js and TypeScript services, untangling API weirdness, and finding the real root cause 🕵️‍♂️⚙️
Living in AWS serverless land, event-driven systems, logs, dashboards, alerts, and the stuff nobody appreciates until it breaks ☁️⚙️
Troubleshooting problems that pop up in the mobile universe, iOS and Android, where “works on my phone” is not a strategy 📱😅
Partnering with the local engineering team in Sydney, and over time helping build a follow-the-sun approach, with eventual coverage across EU and US time zones 🌍✨
Helping the platform mature from strong production support into a more structured Site Reliability Engineer model, including better runbooks, stronger signals, and cleaner incident response 📘📈

🎅 What you’ll bring, plus bonus points if you’ve got it

You love troubleshooting: you’re the person who can take a messy, vague issue and calmly turn it into a clear root cause and a clean fix 🧩🕵️‍♂️
Confident navigating logs and data stores, and doing deeper platform investigations when the answer is not obvious 🔍🧠
Comfortable working with ambiguity, you don’t need a 12-step checklist to get moving, you create momentum, and you go find the answers 🎯⚡
Around 6+ years in a hands-on environment like platform operations, production support, reliability engineering, or technical support for complex systems 🧰
Strong at getting to the bottom of issues in Node.js and TypeScript, including API integrations and service-to-service weirdness 🔧🧠
Confident troubleshooting mobile ecosystem issues: iOS and Android behaviour, device quirks, and app-to-backend interactions 📱⚙️
Comfortable jumping into cloud-first setups, especially modern serverless patterns, monitoring, storage, caching, and search-type services ☁️🔍
You’ve been on the hook for serious incidents before, high severity, high impact, and you know how to respond calmly, communicate clearly, and restore service fast 🚨🧯
You write and speak well, and you can work with engineers, product, support, and leadership without anything getting lost in translation ✍️🤝
Familiarity with observability and monitoring tooling, whether that’s native cloud monitoring or broader platforms, you know what “good signals” look like 📈🔭
Experience supporting a global SaaS or consumer-facing product, where scale, latency, and reliability actually matter 🌍✨
Bonus if you’ve touched React or React Native enough to understand common architecture patterns and where problems tend to hide 🧑‍💻📲
Helpful if you’ve wrestled with deployments and pipelines, CI/CD troubleshooting, release issues, rollback moments, the whole fun circus 🎪🚀
Handy if you can support some older tech stacks too, including Windows-based environments and some .NET-era realities 🪟🧓
Nice if you’ve dealt with regulated or safety-critical environments, or systems where the operational bar is higher than average 🛡️
Even better if you’ve helped lift a team’s capability, playbooks, processes, incident routines, on-call maturity, or operational discipline 🧱📘

🎄 The vibe

Fast growth, global realities, and a team that wants someone proactive, curious, and calm under pressure 🙃✅

If you like building stability while the rocket is already in the air, this one’s got your name on it 🚀🎅

📩 Email Keiran, keiran@bigwavedigital.com.au, or simply apply via the job advert 🎄🎅❄️🚀
